Learning from crowds with decision trees

Cited by: 14
Authors
Yang, Wenjun [1]
Li, Chaoqun [1]
Jiang, Liangxiao [2]
Affiliations
[1] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Crowdsourcing learning; Weighted majority voting; Decision trees; MODEL QUALITY; STATISTICAL COMPARISONS; WEIGHTING FILTER; IMPROVING DATA; CLASSIFIERS; TOOL
DOI
10.1007/s10115-022-01701-9
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Crowdsourcing systems provide an efficient way to collect labeled data by employing non-expert crowd workers. In practice, each instance receives a multiple noisy label set, that is, several possibly incorrect labels from different workers. Ground truth inference algorithms are designed to infer the unknown true labels of the data from these multiple noisy label sets. Because worker quality varies substantially, evaluating the quality of each worker is crucial for ground truth inference. This paper proposes a novel algorithm called decision tree-based weighted majority voting (DTWMV). DTWMV directly takes the multiple noisy label set of each instance as its feature vector; that is, each worker is treated as a feature of the instances. Sequential decision trees are then built to calculate the weight of each feature (worker). Finally, weighted majority voting is used to infer the integrated label of each instance. DTWMV thus converts the evaluation of worker quality into the calculation of feature weights, which offers a new perspective on the ground truth inference problem, and it introduces a novel feature weight measurement based on decision trees. The experimental results show that DTWMV can effectively evaluate the quality of workers and improve the label quality of the data.
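The final step described in the abstract, weighted majority voting over per-worker weights, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the decision trees produce the weights, so the weights here are simply assumed to be given. The function name `weighted_majority_vote`, the `MISSING` marker, and the toy labels and weights are all hypothetical.

```python
from collections import defaultdict

# Assumed marker for "this worker did not label this instance".
MISSING = None


def weighted_majority_vote(labels, weights):
    """Return the label whose supporting workers have the largest total weight.

    labels  -- one entry per worker (MISSING if the worker gave no label)
    weights -- one non-negative weight per worker, assumed to be precomputed
               (in DTWMV they would come from the decision-tree step)
    """
    if len(labels) != len(weights):
        raise ValueError("labels and weights must have one entry per worker")

    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        if label is not MISSING:
            totals[label] += weight

    if not totals:
        return MISSING  # no worker labeled this instance

    # Iterate labels in sorted order so ties go to the smallest label.
    return max(sorted(totals), key=totals.get)


# Toy data: rows are instances, columns are workers (features).
noisy_labels = [
    [1, 0, 0],
    [1, 1, 0],
    [0, MISSING, 1],
]
# Hypothetical worker weights; worker 0 is assumed to be the most reliable.
worker_weights = [0.6, 0.25, 0.15]

integrated = [weighted_majority_vote(row, worker_weights) for row in noisy_labels]
print(integrated)
```

In the first toy instance, plain majority voting would pick label 0 (two votes against one), while the weighted vote picks label 1 because the single dissenting worker carries more weight than the other two combined. This is the effect that worker-quality weighting is meant to achieve.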
Pages: 2123-2140
Number of pages: 18