Label distribution similarity-based noise correction for crowdsourcing

Cited by: 4
Authors
Ren, Lijuan [1]
Jiang, Liangxiao [1]
Zhang, Wenjun [1]
Li, Chaoqun [2, 3]
Affiliations
[1] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
[2] Minist Educ, Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
[3] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
crowdsourcing learning; noise correction; label distribution similarity; Kullback-Leibler divergence; QUALITY; TOOL
DOI
10.1007/s11704-023-2751-3
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In crowdsourcing scenarios, we can obtain each instance's multiple noisy labels from different crowd workers and then infer its integrated label via label aggregation. In spite of the effectiveness of label aggregation methods, there still remains a certain level of noise in the integrated labels. Thus, some noise correction methods have been proposed to reduce the impact of noise in recent years. However, to the best of our knowledge, existing methods rarely consider an instance's information from both its features and multiple noisy labels simultaneously when identifying a noise instance. In this study, we argue that the more distinguishable an instance's features but the noisier its multiple noisy labels, the more likely it is a noise instance. Based on this premise, we propose a label distribution similarity-based noise correction (LDSNC) method. To measure whether an instance's features are distinguishable, we obtain each instance's predicted label distribution by building multiple classifiers using instances' features and their integrated labels. To measure whether an instance's multiple noisy labels are noisy, we obtain each instance's multiple noisy label distribution using its multiple noisy labels. Then, we use the Kullback-Leibler (KL) divergence to calculate the similarity between the predicted label distribution and multiple noisy label distribution and define the instance with the lower similarity as a noise instance. The extensive experimental results on 34 simulated and four real-world crowdsourced datasets validate the effectiveness of our method.
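The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' LDSNC implementation. The abstract does not specify how the predicted distributions are produced in detail, how the noisy label distribution is smoothed, which direction of the KL divergence is used, or how "lower similarity" is turned into a decision. The sketch therefore assumes that predicted distributions are already available, applies Laplace smoothing, computes KL(predicted || noisy), and flags an instance when the divergence exceeds a fixed threshold. The function names and the threshold value are hypothetical.

```python
import math
from collections import Counter


def noisy_label_distribution(labels, num_classes, smoothing=1.0):
    """Distribution of one instance's crowd labels.

    Laplace smoothing is an assumption of this sketch; it keeps every
    class probability strictly positive so the KL divergence stays finite.
    """
    counts = Counter(labels)
    total = len(labels) + smoothing * num_classes
    return [(counts.get(c, 0) + smoothing) / total for c in range(num_classes)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def flag_noise(predicted, crowd_labels, num_classes, threshold=0.5):
    """Flag instances whose predicted and crowd label distributions disagree.

    A larger divergence means lower similarity. The fixed threshold is a
    hypothetical decision rule, not one taken from the paper.
    """
    flags = []
    for p, labels in zip(predicted, crowd_labels):
        q = noisy_label_distribution(labels, num_classes)
        flags.append(kl_divergence(p, q) > threshold)
    return flags


# Two binary instances: classifiers are confident about class 0 in both,
# but the crowd workers agree only on the first one.
predicted = [[0.9, 0.1], [0.95, 0.05]]
crowd_labels = [[0, 0, 0, 1, 0], [1, 1, 0, 1, 0]]
print(flag_noise(predicted, crowd_labels, num_classes=2))
```

In this toy input the second instance has confident predictions but split crowd labels, which matches the abstract's premise that distinguishable features combined with noisy labels indicate a likely noise instance.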
Pages: 12