《Computer Science》 2009-08

Survey of High-performance Web Crawler

ZHOU De-mao, LI Zhou-jun (School of Computer Science and Engineering, Beihang University, Beijing 100191, China)
Web crawlers, one of the basic components of a search engine, are programs that download resources from the Internet. We explain the working principles of web crawlers and their development, and describe how to design a high-performance, scalable, distributed web crawler, including the key problems such a design faces.
【Key Words】: Crawler; High-performance; Scalability
【Fund】: Supported by the National Natural Science Foundation of China (60573057, 90718017)
【Category Index】: TP391.3