Spam filtering based on online ranking logistic regression

Cited by: 0
Authors
Affiliations
[1] Sun, Guanglu
[2] Qi, Haoliang
Source
Sun, G. (guanglu_sun@163.com) | Year: 1600 / Volume: Tsinghua University / Issue: 53
Keywords
Binary classification - Classification models - Discriminative models - Logistic Regression modeling - Machine learning methods - On-line rankings - Spam - Statistical significance
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Spam filtering is an important problem in Web information processing, and many machine learning methods have been applied to it. Current research casts filtering as binary classification, where the optimization target of the classification model is inconsistent with 1-AUC, the usual evaluation measure. This inconsistency biases model optimization and degrades filtering results. In this study, spam filtering is recast as a ranking problem by optimizing directly for 1-AUC, and an online ranking logistic regression model is proposed to correct the deviation of the model's scores in the online learning module. TONE (train on or near error), re-sampling, and weight-update methods are used to speed up learning during the online adjustment of the model's parameters. Experiments on open evaluation datasets show that the proposed method outperforms the traditional online logistic regression model with statistical significance.
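
The ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sparse dict feature format, the function names, the learning rate, and the TONE margin are all assumptions chosen for illustration, and the re-sampling step is omitted. The sketch shows a TONE-gated online logistic regression step, a pairwise ranking step that pushes a spam score above a ham score, and a direct computation of 1-AUC.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, x):
    """Linear score of a sparse example x (dict: feature -> value)."""
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def tone_update(w, x, y, rate=0.1, margin=0.4):
    """One online logistic regression step gated by TONE.

    y is 1 for spam and 0 for ham. The weights change only if the
    example is misclassified or its probability lies within `margin`
    of the 0.5 threshold (both values are illustrative assumptions).
    Returns True if the weights were updated.
    """
    p = sigmoid(score(w, x))
    wrong = (p >= 0.5) != (y == 1)
    near = abs(p - 0.5) < margin
    if not (wrong or near):
        return False
    for f, v in x.items():
        w[f] = w.get(f, 0.0) + rate * (y - p) * v
    return True


def pairwise_update(w, x_spam, x_ham, rate=0.1):
    """One ranking step on a (spam, ham) pair.

    Takes a gradient step on the logistic loss log(1 + exp(-d)),
    where d = score(spam) - score(ham), so spam is ranked higher.
    """
    d = score(w, x_spam) - score(w, x_ham)
    g = 1.0 - sigmoid(d)
    for f, v in x_spam.items():
        w[f] = w.get(f, 0.0) + rate * g * v
    for f, v in x_ham.items():
        w[f] = w.get(f, 0.0) - rate * g * v


def one_minus_auc(scores, labels):
    """Fraction of (spam, ham) pairs ranked in the wrong order (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one spam and one ham example")
    bad = sum(1.0 if n > p else 0.5 if n == p else 0.0 for p in pos for n in neg)
    return bad / (len(pos) * len(neg))
```

The pairwise step optimizes the ordering of spam relative to ham, which is what 1-AUC measures, whereas the pointwise step optimizes per-message classification loss; this is the mismatch the abstract describes.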
Related papers
50 in total
  • [41] Spam Filtering Based on degree-of-contribution
    Li, Jun
    He, Xiaoning
    Qi, Haoliang
    2011 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION AND INDUSTRIAL APPLICATION (ICIA2011), VOL IV, 2011, : 116 - 119
  • [42] An Efficient Spam Filtering Algorithm Based on NPE
    Wang, Ziqiang
    Sun, Xia
    2008 IEEE INTERNATIONAL SYMPOSIUM ON KNOWLEDGE ACQUISITION AND MODELING WORKSHOP PROCEEDINGS, VOLS 1 AND 2, 2008, : 1102 - 1104
  • [43] Spam filtering based on latent semantic indexing
    Gansterer, Wilfried N.
    Janecek, Andreas G. K.
    Neumayer, Robert
    SURVEY OF TEXT MINING II: CLUSTERING, CLASSIFICATION, AND RETRIEVAL, 2008, : 165 - +
  • [44] Spam filtering system based on uncertain learning
    Liu Zhen
    Fu Yan
    Xie Feng-zhu
    2009 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, VOL II, PROCEEDINGS, 2009, : 141 - 144
  • [45] Spam Filtering Based on degree-of-contribution
    Li, Jun
    He, Xiaoning
    Qi, Haoliang
    2010 THE 3RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND INDUSTRIAL APPLICATION (PACIIA2010), VOL IX, 2010, : 118 - 121
  • [46] Email Spam Filtering Based on the MNMF Algorithm
    Liu, Zun-xiong
    Tian, Shan-shan
    Huang, Zhi-qiang
    Liu, Jiang-wei
    INTERNATIONAL JOURNAL OF SECURITY AND ITS APPLICATIONS, 2016, 10 (01): : 31 - 44
  • [47] Spam Mail Filtering Based on Network Processor
    Lin, Lian
    Li, Zhongwen
    Shi, Liang
    2008 IFIP INTERNATIONAL CONFERENCE ON NETWORK AND PARALLEL COMPUTING, PROCEEDINGS, 2008, : 184 - 189
  • [48] A trust based system for enhanced spam filtering
    Telecommunications Software and Systems Group, Waterford Institute of Technology, Waterford, Ireland
    J. Softw., 2008, 5 (55-64):
  • [49] Adaptive Spam Filtering Based on Fingerprint Vectors
    Liu, Weihong
    Fang, Weidong
    2008 ISECS INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT, VOL 1, PROCEEDINGS, 2008, : 384 - 388
  • [50] Filtering spam email based on retry patterns
    Lieven, Peter
    Scheuermann, Bjoern
    Stini, Michael
    Mauve, Martin
    2007 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-14, 2007, : 1515 - 1520