《Journal of Chongqing University of Technology(Natural Science)》 2017-01

Text Representation Based on Improved Vector Space Model

ZHANG Xiao-chuan; YU Xu-ting; ZHANG Yi-hao (College of Computer Science and Engineering, Chongqing University of Technology)
Text representation converts human-readable text into a data structure that a computer can process, and it is a fundamental problem in text information processing. As a text representation approach within the Vector Space Model (VSM), the tf-idf algorithm considers only the relevance between a term feature and a document, not the relevance between the term and a class. To address this problem, the paper introduces the chi-square statistic from mathematical statistics and proposes a text representation algorithm, tf-idf-cθ. The algorithm uses a term's c value, which measures how differently the term is distributed across classes, as one factor of the text representation. It also takes the term's characteristics into account through a θ value, and from these it produces a text representation based on the improved VSM. Finally, short texts are classified using both algorithms (tf-idf and tf-idf-cθ). The experimental results show that the modified method is more effective and partly addresses the relevance between term features and classes.
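The abstract does not state the tf-idf-cθ formula, so the Python sketch below only illustrates the two standard ingredients it names: plain tf-idf weighting and the chi-square statistic of a term with respect to a class. The function `class_aware_weights`, which multiplies each tf-idf weight by the term's largest chi-square value over all classes, is a hypothetical combination chosen for illustration. It is not the authors' c or θ factor, and the toy corpus and labels are made up.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (standard tf-idf)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


def chi_square(docs, labels, term, cls):
    """Chi-square statistic of `term` versus class `cls` (2x2 contingency table)."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        has_term = term in doc
        if has_term and label == cls:
            a += 1  # term present, document in class
        elif has_term:
            b += 1  # term present, document not in class
        elif label == cls:
            c += 1  # term absent, document in class
        else:
            d += 1  # term absent, document not in class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def class_aware_weights(docs, labels):
    """HYPOTHETICAL combination (not the paper's formula):
    tf-idf weight times the term's maximum chi-square over all classes."""
    classes = set(labels)
    result = []
    for doc_weights in tf_idf(docs):
        result.append(
            {
                t: w * max(chi_square(docs, labels, t, k) for k in classes)
                for t, w in doc_weights.items()
            }
        )
    return result


# Toy corpus (made up for illustration).
docs = [
    ["cheap", "pills"],
    ["cheap", "offer"],
    ["meeting", "notes"],
    ["meeting", "agenda"],
]
labels = ["spam", "spam", "ham", "ham"]

plain = tf_idf(docs)
weighted = class_aware_weights(docs, labels)
print(plain[0])
print(weighted[0])
```

In this toy corpus, plain tf-idf ranks "pills" above "cheap" in the first document because "pills" is rarer. "cheap" appears in every spam document and in no ham document, so its chi-square value is higher, and the class-aware weight reverses the ranking. This is the kind of term-class relevance that the abstract says plain tf-idf ignores.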
【Fund】: National Natural Science Foundation of China (61502064); Chongqing "121" Science and Technology Support Demonstration Project (cstc2014fazktjcsf40009)
【Category Index】: TP391.1