《Geomatics and Information Science of Wuhan University》 2009-08

A Cross-step Word Segmentation Algorithm for Understanding Traffic Information Represented in Natural Chinese Language

LU Feng (1), LIU Huanhuan (1,2), CHEN Chuanbin (1,3)
(1) LREIS, Institute of Geographic Sciences and Natural Resources Research, CAS, A11 Datun Road, Beijing 100101, China
(2) College of Resources and Safety Engineering, China University of Mining and Technology, D11 Xueyuan Road, Beijing 100083, China
(3) Spatial Information Research Center, Fuzhou University, 523 Gongye Road, Fuzhou 350002, China
This paper proposes a novel cross-step word segmentation algorithm for processing real-time traffic information expressed in natural Chinese, to meet the urgent need for real-time travel information services built on dynamic traffic information. Based on the distribution of record lengths in the word libraries that describe real-time traffic information, the algorithm sets a corresponding segmentation step for each of the address, direction, and event libraries. It replaces the single-step advance of the string pointer used in classical Chinese word segmentation with flexible multi-step advances, so that candidate Chinese words are aggregated efficiently. A case study shows that the proposed algorithm runs 10 times faster than an improved MM (maximum matching) algorithm while keeping similar accuracy and robustness. The authors argue that the algorithm is of great help to the automatic and intelligent processing of real-time traffic information and facilitates the development of travel information services.
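The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea under stated assumptions: each library gets its own set of candidate step lengths, taken here from the lengths of its records, and the string pointer jumps by the length of a matched word instead of always advancing one character. The names `build_steps`, `segment`, and `LIBRARIES`, the sample library entries, and the tie-breaking rule (longest match wins) are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only; not the authors' implementation.
# Library contents below are invented sample entries.
LIBRARIES = {
    "address": {"中关村大街", "北四环"},
    "direction": {"由南向北", "双向"},
    "event": {"车行缓慢", "拥堵"},
}


def build_steps(words):
    """Return the distinct record lengths of a library, longest first.

    Assumption: the per-library steps are derived from record lengths.
    """
    return sorted({len(w) for w in words}, reverse=True)


def segment(text, libraries):
    """Segment text into (library_name, word) pairs.

    At each pointer position only the step lengths of each library are
    tried. On a match the pointer jumps by the matched length; otherwise
    it advances by a single character.
    """
    steps = {name: build_steps(words) for name, words in libraries.items()}
    tokens = []
    i = 0
    while i < len(text):
        best = None
        for name, words in libraries.items():
            for step in steps[name]:
                piece = text[i:i + step]
                if len(piece) == step and piece in words:
                    # Keep the longest match found across all libraries.
                    if best is None or step > len(best[1]):
                        best = (name, piece)
                    break  # longest step for this library already found
        if best is not None:
            tokens.append(best)
            i += len(best[1])  # multi-step jump of the string pointer
        else:
            i += 1  # no library matched; skip one character
    return tokens


if __name__ == "__main__":
    for name, word in segment("中关村大街由南向北车行缓慢", LIBRARIES):
        print(name, word)
```

Restricting the candidate lengths to those that actually occur in each library is one plausible way to avoid testing every substring length at every position, which is where a speed-up over character-by-character matching could come from.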
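【Fund】: National 863 Program of China (2006AA12Z209, 2007AA12Z241); National Natural Science Foundation of China (40871184); Key Directional Project of the Knowledge Innovation Program of the Chinese Academy of Sciences (KZCX2-YW-308)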
【Category Index】: TP391.1