Filtering out Outliers in Learning to Rank

Cited by: 5
Authors
Marcuzzi, Federico [1]
Lucchese, Claudio [1]
Orlando, Salvatore [1]
Affiliations
[1] Univ Ca Foscari Venezia, Venice, Italy
Keywords
information retrieval; learning to rank; machine learning
DOI
10.1145/3539813.3545127
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier data points are known to negatively affect the learning process of regression or classification models, yet their impact in the learning-to-rank scenario has not been thoroughly investigated so far. In this work we propose SOUR, a learning-to-rank method that detects and removes outliers before building an effective ranking model. We limit our analysis to gradient boosting decision trees, where SOUR searches for outlier instances that are incorrectly ranked in several iterations of the learning process. Extensive experiments show that removing a limited number of outlier data instances before re-training a new model provides statistically significant improvements, and that SOUR outperforms state-of-the-art de-noising and outlier detection methods.
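The idea of flagging instances that stay incorrectly ranked across several boosting iterations can be illustrated with a minimal sketch. This is not the authors' SOUR implementation: the abstract does not specify the exact criterion, so the function names (`inverted_pair_ratio`, `find_outliers`), the two thresholds, and the pairwise definition of "incorrectly ranked" are all assumptions made for illustration. The sketch takes the scores a model assigns to the documents of one query at several iterations and returns the documents that are misordered in most of them.

```python
from typing import List, Sequence


def inverted_pair_ratio(scores: Sequence[float], labels: Sequence[int], i: int) -> float:
    """Fraction of document i's comparable pairs that the scores order wrongly.

    A pair (i, j) is comparable when the two relevance labels differ.
    Hypothetical criterion, chosen only for this illustration.
    """
    comparable = 0
    inverted = 0
    for j in range(len(scores)):
        if labels[j] == labels[i]:
            continue  # same label: no preferred order, skip the pair
        comparable += 1
        # Inverted (or tied) when score order disagrees with label order.
        if (scores[i] - scores[j]) * (labels[i] - labels[j]) <= 0:
            inverted += 1
    return inverted / comparable if comparable else 0.0


def find_outliers(
    scores_per_iteration: Sequence[Sequence[float]],
    labels: Sequence[int],
    pair_threshold: float = 0.5,       # assumed: "misranked" if > 50% of pairs inverted
    iteration_threshold: float = 0.8,  # assumed: outlier if misranked in >= 80% of iterations
) -> List[int]:
    """Return indices of documents misranked in most of the given iterations."""
    n_iter = len(scores_per_iteration)
    outliers = []
    for i in range(len(labels)):
        hits = sum(
            1
            for scores in scores_per_iteration
            if inverted_pair_ratio(scores, labels, i) > pair_threshold
        )
        if n_iter and hits / n_iter >= iteration_threshold:
            outliers.append(i)
    return outliers


# Toy query with four documents; document 3 carries the highest label
# but is scored lowest at every iteration.
labels = [2, 1, 0, 2]
scores_per_iteration = [
    [0.9, 0.5, 0.4, 0.1],
    [1.2, 0.7, 0.3, 0.0],
    [1.5, 0.9, 0.2, -0.1],
]
outliers = find_outliers(scores_per_iteration, labels)
kept = [i for i in range(len(labels)) if i not in outliers]
print(outliers, kept)
```

In a full pipeline the per-iteration scores would come from a gradient boosting model evaluated at successive numbers of trees, and a new model would then be trained on the kept instances only; both steps are left out of this sketch.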
Pages: 125-133
Number of pages: 9