《Computer Engineering and Applications》 2010-03

Research on algorithm of Chinese word automatic segmentation

HE Guo-bin, ZHAO Jing-lu (College of Computer and Information Science, Southwest University, Chongqing 400715, China)
The mechanism of Chinese word segmentation is analyzed, and an improved structure for the segmentation dictionary is presented. Building on the characteristics of mechanical (dictionary-matching) Chinese word segmentation and combining it with a probabilistic method, a probabilistic algorithm for automatic Chinese word segmentation is discussed. Hashing and binary search are used for dictionary matching during segmentation. Experiments indicate that the algorithm greatly improves both the speed and the precision of Chinese word segmentation and strengthens ambiguity resolution.
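【Fund】: Supported by a Development Fund project (Research on Algorithms for Intelligent Acquisition of Web Information, College of Computer and Information Science, Southwest University)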
【Category Index】: TP391.1
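
The abstract names three ingredients: a dictionary searched by hashing and binary search, mechanical matching against that dictionary, and a probabilistic step for resolving ambiguity. The abstract does not give the paper's actual dictionary layout or probability model, so the following is only a minimal sketch under stated assumptions: the dictionary is hashed on the first character with a sorted bucket per character, and ambiguity is resolved by a unigram maximum-probability search. The lexicon, its frequency counts, and the names `build_index`, `in_dictionary`, and `segment` are all hypothetical.

```python
from bisect import bisect_left
from math import log

# Toy lexicon: word -> invented frequency count (illustration only).
LEXICON = {"研究": 20, "研究生": 5, "生命": 10, "命": 2, "生": 3, "的": 50, "起源": 8}


def build_index(lexicon):
    """Hash words on their first character; each bucket is a sorted list."""
    index = {}
    for word in lexicon:
        index.setdefault(word[0], []).append(word)
    for bucket in index.values():
        bucket.sort()
    return index


def in_dictionary(index, word):
    """Hash lookup on the first character, then binary search in the bucket."""
    bucket = index.get(word[0], [])
    pos = bisect_left(bucket, word)
    return pos < len(bucket) and bucket[pos] == word


def segment(text, lexicon, index, max_len=4):
    """Return the segmentation with the highest unigram probability."""
    total = sum(lexicon.values())
    # best[i] = (best log-probability of text[:i], start of its last word)
    best = [(0.0, 0)] + [(float("-inf"), 0)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if in_dictionary(index, word):
                score = log(lexicon[word] / total)
            elif end - start == 1:
                # Assumed fallback: an unknown single character gets a
                # small pseudo-count so every position stays reachable.
                score = log(0.5 / total)
            else:
                continue
            candidate = best[start][0] + score
            if candidate > best[end][0]:
                best[end] = (candidate, start)
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


if __name__ == "__main__":
    index = build_index(LEXICON)
    print(segment("研究生命的起源", LEXICON, index))
```

With these invented counts the sketch splits 研究生命的起源 into 研究 / 生命 / 的 / 起源, whereas plain forward maximum matching would first take the longer word 研究生 and leave 命 stranded. This is the kind of ambiguity a probabilistic step is meant to resolve.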