《Journal of Data Acquisition and Processing》 2011-01

Density-Based Clustering Algorithm for Hybrid Coding Detection in Search Engines

Zhang Cheng, Zhang Qifei, Pan Xuezeng, Zhu Xuhui (1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China; 2. 73610 Unit of PLA, Nanjing Military Region, Nanjing 210018, China)
This paper addresses Chinese HTML documents on the Internet that mix several character encodings in one file. It studies the character-encoding composition of Chinese HTML files and clusters the contents of such hybrid-coding files. The content of each HTML file is first separated into several classes using the classical data mining algorithm DBSCAN. After clustering, the encoding of each class is detected from its encoding features. Experimental results show that, with appropriately chosen parameters, the consistency of each class with the Chinese character encoding features reaches 100%. The method can be applied in search engines.
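The two-stage idea in the abstract (cluster the content with DBSCAN, then detect the encoding of each class) can be illustrated with a small sketch. The abstract does not state which features or parameters the authors use, so everything specific here is an assumption made for illustration: the two-dimensional feature (share of non-ASCII bytes, share of characters that fail strict UTF-8 decoding), the values `eps=0.2` and `min_pts=2`, the threshold rule in `guess_encoding`, and all function names. It should not be read as the paper's method.

```python
from math import dist

NOISE = -1


def features(segment: bytes):
    """Hypothetical 2-D feature for a non-empty byte segment (an assumption)."""
    high = sum(b >= 0x80 for b in segment)           # non-ASCII bytes
    decoded = segment.decode("utf-8", errors="replace")
    bad = decoded.count("\ufffd")                    # UTF-8 decoding failures
    return (high / len(segment), bad / len(decoded))


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one cluster label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE                        # may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster                  # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:                 # core point: keep expanding
                queue.extend(nbrs)
    return labels


def guess_encoding(cluster_points):
    """Assumed rule: few UTF-8 failures means UTF-8, otherwise a GB-family encoding."""
    mean_bad = sum(p[1] for p in cluster_points) / len(cluster_points)
    return "utf-8" if mean_bad < 0.5 else "gbk"


# Segments as they might be cut from one hybrid-coding HTML file.
segments = [
    "中文".encode("utf-8"),
    "你好".encode("utf-8"),
    "编码".encode("utf-8"),
    b"\xd6\xd0\xce\xc4",                  # "中文" in GBK
    b"\xc4\xe3\xba\xc3",                  # "你好" in GBK
    b"\xd6\xd0\xce\xc4\xc4\xe3\xba\xc3",  # "中文你好" in GBK
    b"<html><body>",                      # pure ASCII markup
]

points = [features(s) for s in segments]
labels = dbscan(points, eps=0.2, min_pts=2)

encodings = {}
for c in sorted(set(labels) - {NOISE}):
    members = [p for p, lab in zip(points, labels) if lab == c]
    encodings[c] = guess_encoding(members)

print(labels)
print(encodings)
```

In this toy input the UTF-8 and GBK segments fall into two separate dense groups, while the lone ASCII markup segment has no neighbors within `eps` and is left as noise. Real pages would need richer features and tuned parameters, which is the parameter selection the abstract alludes to.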
【Fund】: National Key Technology R&D Program of China (2008BAH21B03); Public Welfare Technology Application Research Program of Zhejiang Province (2010C31003)
【Category Index】: TP391.3