Research on Incremental Text Clustering Based on Cluster Features
PAN Min; WANG Ming-wen; WANG Xiao-qing; JIE An-quan; College of Computer Information Engineering, Jiangxi Normal University
This paper presents an incremental text clustering algorithm based on cluster features. First, an initial clustering is performed with the simple and efficient k-means algorithm. Second, the cluster center, mean, variance, number of documents, third central moment, and fourth central moment are saved as the cluster features of each cluster. Finally, when new documents arrive, they are clustered incrementally using those cluster features. Experimental results on the 20 Newsgroups data set indicate that the proposed algorithm has some advantages.
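The abstract does not specify how the cluster features are stored or how a new document is matched to a cluster, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes documents are dense numeric vectors, that the moments are kept per dimension as running power sums (from which the mean, variance, and third and fourth central moments can be recovered without revisiting old documents), and that a new document joins the cluster with the nearest center by Euclidean distance. The names `ClusterFeature` and `assign` are hypothetical.

```python
import math


class ClusterFeature:
    """Per-cluster summary that can be updated one document at a time.

    Assumption: moments are tracked per dimension via running power sums.
    """

    def __init__(self, dim):
        self.n = 0  # number of documents in the cluster
        # sums[p - 1][j] holds the sum of x_j ** p over the cluster, p = 1..4
        self.sums = [[0.0] * dim for _ in range(4)]

    def add(self, vec):
        """Fold one document vector into the summary."""
        self.n += 1
        for p in range(4):
            row = self.sums[p]
            for j, x in enumerate(vec):
                row[j] += x ** (p + 1)

    def _raw(self, p):
        # p-th raw moment per dimension; requires a non-empty cluster
        return [s / self.n for s in self.sums[p - 1]]

    def center(self):
        return self._raw(1)

    def variance(self):
        m1, m2 = self._raw(1), self._raw(2)
        return [b - a * a for a, b in zip(m1, m2)]

    def third_central_moment(self):
        m1, m2, m3 = self._raw(1), self._raw(2), self._raw(3)
        return [c - 3 * a * b + 2 * a ** 3 for a, b, c in zip(m1, m2, m3)]

    def fourth_central_moment(self):
        m1, m2, m3, m4 = (self._raw(p) for p in (1, 2, 3, 4))
        return [d - 4 * a * c + 6 * a * a * b - 3 * a ** 4
                for a, b, c, d in zip(m1, m2, m3, m4)]


def assign(vec, features):
    """Assign vec to the cluster with the nearest center and update it.

    Assumption: nearest-center Euclidean matching; the paper may use a
    different rule that also draws on the higher moments.
    """
    best = min(range(len(features)),
               key=lambda i: math.dist(vec, features[i].center()))
    features[best].add(vec)
    return best
```

In this sketch the initial clusters would come from a k-means run, with each document added to its cluster's `ClusterFeature` once; later documents are then handled by `assign` alone, so the original collection never needs to be reclustered.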