《Journal of Taiyuan Normal University(Natural Science Edition)》 2013-01

Impact Analysis of Classification Performance for Cross-Validation of Imbalance Splitting Data

Zhao Cunxiu (1), Wang Ruibo (2), Li Jihong (2) (1. School of Mathematical Sciences, Shanxi University, Taiyuan 030006; 2. Computer Center, Shanxi University, Taiyuan 030006, China)
Cross-validation is widely used to estimate the generalization error of a model. In particular, 2-fold cross-validation is widely used to compare classification models. This paper studies model performance under 2-fold cross-validation for a logistic regression model whose features (independent variables) take the values 0 or 1. The results show that the deviation of the 2-fold cross-validation estimates of precision, recall, F-value, and accuracy is smallest when the class distributions of the two folds are the same or similar, and that the deviation grows as the class distributions of the two folds move apart. The estimate of model performance degrades significantly when the class distributions of the two folds diverge. Therefore, when splitting data for cross-validation, the class distribution in each fold should be kept consistent with that of the full sample.
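The recommendation above, keeping each fold's class distribution consistent with the sample, can be illustrated with a minimal sketch of a stratified 2-fold split. This is not the authors' code: the function names `stratified_two_fold` and `positive_rate`, the seed, and the 30/70 toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_two_fold(labels, seed=0):
    """Split indices into two folds, halving each class separately.

    Hypothetical helper: each class is shuffled and cut in half, so both
    folds keep roughly the class proportions of the full sample. When a
    class has an odd count, the extra item goes to the second fold.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    fold_a, fold_b = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        half = len(indices) // 2
        fold_a.extend(indices[:half])
        fold_b.extend(indices[half:])
    return fold_a, fold_b


def positive_rate(labels, indices):
    """Fraction of items labelled 1 among the given indices."""
    return sum(labels[i] for i in indices) / len(indices)


# Toy sample (assumed): 30 positive and 70 negative examples.
labels = [1] * 30 + [0] * 70
fold_a, fold_b = stratified_two_fold(labels)
rate_a = positive_rate(labels, fold_a)
rate_b = positive_rate(labels, fold_b)
print(rate_a, rate_b)
```

With this toy sample, both folds hold 15 positive and 35 negative examples, so each fold matches the overall positive rate. A plain random split gives no such guarantee, which is the situation the abstract warns about.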
[Category Index]: TP18