Sample and rule centric approach for associative classification on imbalanced data

Authors
Yang G. [1]
Cui X. [1]
Zhang X. [1]
Affiliation
[1] Institute of Systems Engineering, Dalian University of Technology, Dalian
Funding
National Natural Science Foundation of China
Keywords
Associative classification; Imbalanced data; Rule centric processing; Sample centric processing
DOI
10.12011/1000-6788(2017)04-1035-11
Abstract
The emergence of imbalanced data poses a great challenge for associative classification (AC) methods. To improve AC's performance on imbalanced data, this paper presents a key value sampling (KVS) method and a rule validation (RV) method, which address the problem from the data-processing side and the rule-processing side, respectively. KVS samples the original imbalanced data and achieves class balance by removing the instances weakly correlated with the majority class and increasing the number of instances strongly correlated with the minority class, which avoids losing much useful information and highlights the information related to the minority class. RV validates the initially generated classifier and improves the poorly performing rules, which enhances the performance of the classifier as a whole. Experimental analysis shows that the proposed methods improve the performance of AC on imbalanced data classification. © 2017, Editorial Board of Journal of Systems Engineering Society of China. All rights reserved.
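The abstract does not say how KVS scores or selects instances, so the following is only a minimal sketch of the general idea of score-guided resampling, not the paper's KVS algorithm. The function name `rebalance`, the per-instance `scores` (assumed to measure how strongly an instance is correlated with its own class), and the choice of the midpoint as the target class size are all hypothetical.

```python
from itertools import cycle, islice


def rebalance(labels, scores, minority):
    """Return indices of a class-balanced resample (illustrative only).

    Assumptions (not from the paper):
    - scores[i] measures how strongly instance i correlates with its own class;
    - the minority class is smaller than the rest of the data;
    - both groups are brought to the midpoint of the two original sizes.
    """
    major = [i for i, y in enumerate(labels) if y != minority]
    minor = [i for i, y in enumerate(labels) if y == minority]
    if not major or not minor:
        return list(range(len(labels)))

    target = (len(major) + len(minor)) // 2

    # Undersample: drop the majority instances with the weakest scores.
    strongest_major = sorted(major, key=lambda i: scores[i], reverse=True)
    kept_major = sorted(strongest_major[:target])

    # Oversample: repeat the minority instances with the strongest scores.
    strongest_minor = sorted(minor, key=lambda i: scores[i], reverse=True)
    extra = max(0, target - len(minor))
    added_minor = list(islice(cycle(strongest_minor), extra))

    return kept_major + minor + added_minor


labels = ["neg"] * 6 + ["pos"] * 2
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]
balanced = rebalance(labels, scores, minority="pos")
```

With this toy input the resample keeps four majority instances and four minority instances (two of them duplicates). A real implementation would need the paper's definition of instance-class correlation in place of the placeholder `scores`.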
Pages: 1035-1045
Number of pages: 10