《Journal of Tsinghua University (Science and Technology)》 2013-05

Spam filtering based on online ranking logistic regression

SUN Guanglu 1, QI Haoliang 2
(1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China; 2. College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China)
Spam filtering is an important problem in Web information processing, and many machine learning methods have been applied to it. Current research treats filtering as binary classification, in which the optimization objective of the classification model is inconsistent with 1-AUC, the usual evaluation measure. This inconsistency biases model optimization and degrades filtering results. In this study, spam filtering is recast as a ranking problem by optimizing directly for 1-AUC, and an online ranking logistic regression model is proposed to correct the deviation of the model's scores in the online learning module. TONE (train on or near error), re-sampling, and weight update methods are used to speed up learning during online adjustment of the model's parameters. Experiments on open evaluation datasets show that the proposed method outperforms the traditional online logistic regression model, with statistical significance.
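
The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a pairwise logistic loss over (spam, ham) pairs as the ranking surrogate for 1-AUC, a small per-class buffer as a stand-in for re-sampling, and a margin test on the pair score difference as a TONE-style rule. All function names, the margin, the learning rate, and the buffer size are hypothetical choices made for illustration.

```python
import math
import random

def score(w, x):
    """Linear score of a sparse feature dict x under the weight dict w."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())

def one_minus_auc(scores, labels):
    """Fraction of (spam, ham) pairs ranked wrongly; ties count as half."""
    spam = [s for s, y in zip(scores, labels) if y == 1]
    ham = [s for s, y in zip(scores, labels) if y == 0]
    if not spam or not ham:
        raise ValueError("need at least one spam and one ham message")
    bad = sum((h > s) + 0.5 * (h == s) for s in spam for h in ham)
    return bad / (len(spam) * len(ham))

def pairwise_update(w, x_spam, x_ham, lr=0.1, margin=1.0):
    """One gradient step on log(1 + exp(-(score(spam) - score(ham)))).

    Assumed TONE-style rule: skip pairs that are already ordered
    correctly by more than `margin`. Returns True if w was changed.
    """
    diff = score(w, x_spam) - score(w, x_ham)
    if diff > margin:
        return False
    # Gradient magnitude is sigmoid(-diff); clamp diff to avoid overflow.
    g = 1.0 / (1.0 + math.exp(max(-30.0, min(30.0, diff))))
    for f, v in x_spam.items():
        w[f] = w.get(f, 0.0) + lr * g * v
    for f, v in x_ham.items():
        w[f] = w.get(f, 0.0) - lr * g * v
    return True

def train_online(stream, lr=0.1, margin=1.0, buffer_size=50, pairs=3, seed=0):
    """Process (features, label) pairs one at a time; label 1 = spam, 0 = ham.

    Each new message is paired with up to `pairs` buffered messages of
    the opposite class (an assumed, simplified form of re-sampling).
    """
    rng = random.Random(seed)
    w = {}
    buffers = {0: [], 1: []}
    for x, y in stream:
        opposite = buffers[1 - y]
        for other in rng.sample(opposite, min(pairs, len(opposite))):
            x_spam, x_ham = (x, other) if y == 1 else (other, x)
            pairwise_update(w, x_spam, x_ham, lr, margin)
        buffers[y].append(x)
        if len(buffers[y]) > buffer_size:
            buffers[y].pop(0)  # drop the oldest buffered message
    return w

# Toy stream: the feature "free" is shared, the other feature separates the classes.
toy = [({"viagra": 1.0, "free": 1.0}, 1), ({"meeting": 1.0, "free": 1.0}, 0)] * 20
weights = train_online(toy)
toy_error = one_minus_auc([score(weights, x) for x, _ in toy], [y for _, y in toy])
```

The point of the pairwise form is that each update pushes a spam message above a ham message, which is the quantity 1-AUC counts, whereas an ordinary logistic regression update fits each message's label independently.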
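【Fund】: National Natural Science Foundation of China (60903083); Heilongjiang Province New Century Talents Program (1155-ncet-008); Ministry of Education Doctoral Program Fund for New Teachers (20092303120005)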
【Category Index】: TP393.098