《Journal of Beijing Institute of Graphic Communication》 2018-03

Research on Data Crawler of Electric Business Books Based on Python

JIN Zhenjie; CAO Shaozhong; XIANG Hongfeng; WANG Mingdao; LI Xinpei (Beijing Institute of Graphic Communication)
With the rapid development of the internet, online shopping has become a main mode of everyday consumption. Shoppers who want to buy computer books, for example, need a clear overview of the many titles on offer. To meet this need, we study simulated browser login and web page parsing techniques based on Scrapy, a crawler framework for the Python language. The program stores the collected book information in a MongoDB database or on the local hard drive for subsequent data analysis. The crawler is simple to implement, runs stably, and effectively collects book data from e-commerce sites.
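The parse-then-store flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library (`html.parser` and `json`) in place of Scrapy and MongoDB, and the HTML markup, field names, and the helpers `parse_books` and `save_books` are all hypothetical, since the abstract does not give the target site's page structure.

```python
import json
from html.parser import HTMLParser
from pathlib import Path

# Hypothetical listing markup; the real site's HTML is not given in the abstract.
SAMPLE_HTML = """
<ul>
  <li class="book"><span class="title">Learning Python</span><span class="price">59.00</span></li>
  <li class="book"><span class="title">Fluent Python</span><span class="price">89.50</span></li>
</ul>
"""


class BookParser(HTMLParser):
    """Collect one raw record per <li class="book"> element."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None  # record being assembled, or None outside a book
        self._field = None    # name of the <span> currently being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "book":
            self._current = {}
        elif tag == "span" and self._current is not None and cls in ("title", "price"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside a recognised field of a book record.
        if self._field and self._current is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.books.append(self._current)
            self._current = None


def parse_books(html):
    """Return cleaned book records; incomplete records are skipped."""
    parser = BookParser()
    parser.feed(html)
    parser.close()
    return [
        {"title": b["title"].strip(), "price": float(b["price"])}
        for b in parser.books
        if "title" in b and "price" in b
    ]


def save_books(books, path):
    """Write one JSON object per line to a local file (the local-disk option)."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for book in books:
            fh.write(json.dumps(book, ensure_ascii=False) + "\n")
```

Calling `save_books(parse_books(SAMPLE_HTML), "books.jsonl")` writes one record per line. In a Scrapy project the parsing step would typically live in a spider callback and the storage step in an item pipeline, with a MongoDB client taking the place of the local file.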
【Fund】: National Natural Science Foundation of China (61472461); National Major Scientific Instrument and Equipment Development Project (2013YQ140517)
【Category Index】: TP311.13