Learning from crowds with decision trees

Cited by: 14
Authors
Yang, Wenjun [1]
Li, Chaoqun [1]
Jiang, Liangxiao [2]
Affiliations
[1] China Univ Geosci, Sch Math & Phys, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
Keywords
Crowdsourcing learning; Weighted majority voting; Decision trees; MODEL QUALITY; STATISTICAL COMPARISONS; WEIGHTING FILTER; IMPROVING DATA; CLASSIFIERS; TOOL
DOI
10.1007/s10115-022-01701-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Crowdsourcing systems provide an efficient way to collect labeled data by employing non-expert crowd workers. In practice, each instance receives a set of multiple noisy labels from different workers. Ground truth inference algorithms are designed to infer the unknown true labels of the data from these multiple noisy label sets. Because worker quality varies substantially, evaluating the quality of each worker is crucial for ground truth inference. This paper proposes a novel algorithm called decision tree-based weighted majority voting (DTWMV). DTWMV directly takes the multiple noisy label set of each instance as its feature vector; that is, each worker is treated as a feature of the instances. Sequential decision trees are then built to calculate the weight of each feature (worker). Finally, weighted majority voting is used to infer the integrated label of each instance. In DTWMV, evaluating worker quality is thus converted into calculating feature weights, which provides a new perspective on the ground truth inference problem. To this end, a novel feature weight measurement based on decision trees is proposed. Experimental results show that DTWMV can effectively evaluate the quality of workers and improve the label quality of the data.
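The final step described in the abstract, weighted majority voting over per-worker weights, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not specify the decision tree-based weight measurement, so the hypothetical helper `agreement_weights` below swaps in a much simpler stand-in (each worker's rate of agreement with plain majority voting). All function names and the toy label matrix are assumptions made for illustration, and every instance is assumed to have at least one label.

```python
from collections import defaultdict


def majority_vote(labels):
    """Plain majority vote over one instance's labels (None = no label given)."""
    counts = defaultdict(float)
    for y in labels:
        if y is not None:
            counts[y] += 1.0
    # sorted() makes tie-breaking deterministic (smallest label wins a tie).
    return max(sorted(counts), key=counts.get)


def weighted_majority_vote(labels, weights):
    """Each worker's vote counts with that worker's weight."""
    scores = defaultdict(float)
    for y, w in zip(labels, weights):
        if y is not None:
            scores[y] += w
    return max(sorted(scores), key=scores.get)


def agreement_weights(label_matrix):
    """Stand-in weight estimate (NOT the tree-based measure of DTWMV):
    the fraction of a worker's labels that agree with plain majority voting."""
    consensus = [majority_vote(row) for row in label_matrix]
    n_workers = len(label_matrix[0])
    weights = []
    for j in range(n_workers):
        answered = [(row[j], c) for row, c in zip(label_matrix, consensus)
                    if row[j] is not None]
        agree = sum(1 for y, c in answered if y == c)
        weights.append(agree / len(answered) if answered else 0.0)
    return weights


# Toy data: rows are instances, columns are workers (features).
label_matrix = [
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
weights = agreement_weights(label_matrix)
integrated = [weighted_majority_vote(row, weights) for row in label_matrix]
print(weights, integrated)
```

The point of the weighting is that one reliable worker can outvote several unreliable ones: with weights 0.9, 0.2, and 0.3, the label set `[1, 0, 0]` integrates to 1 under weighted voting, whereas plain majority voting returns 0. In DTWMV, the weights would instead come from the decision trees built over the worker-as-feature representation.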
Pages: 2123-2140
Number of pages: 18
Related papers
50 in total
  • [1] Learning from crowds with decision trees
    Wenjun Yang
    Chaoqun Li
    Liangxiao Jiang
    Knowledge and Information Systems, 2022, 64 : 2123 - 2140
  • [2] Circuit Learning: From Decision Trees to Decision Graphs
    Huang, Yu-Shan
    Jiang, Jie-Hong R.
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (11) : 3985 - 3996
  • [3] Learning From Crowds
    Raykar, Vikas C.
    Yu, Shipeng
    Zhao, Linda H.
    Valadez, Gerardo Hermosillo
    Florin, Charles
    Bogoni, Luca
    Moy, Linda
    JOURNAL OF MACHINE LEARNING RESEARCH, 2010, 11 : 1297 - 1322
  • [4] Learning from crowds
    Raykar, Vikas C.
    Yu, Shipeng
    Zhao, Linda H.
    Valadez, Gerardo Hermosillo
    Florin, Charles
    Bogoni, Luca
    Moy, Linda
    Journal of Machine Learning Research, 2010, 11 : 1297 - 1322
  • [5] Learning Decision Trees from Distributed Datasets
    Xie Hongxia
    Shi Liping
    Meng Fanrong
    Wang Chun
    DCABES 2008 PROCEEDINGS, VOLS I AND II, 2008, : 96 - +
  • [6] Learning from imperfect examples in decision trees
    Janikow, CZ
    COMPUTERS AND THEIR APPLICATIONS - PROCEEDINGS OF THE ISCA 11TH INTERNATIONAL CONFERENCE, 1996, : 71 - 74
  • [7] LEARNING DECISION TREES FROM RANDOM EXAMPLES
    EHRENFEUCHT, A
    HAUSSLER, D
    INFORMATION AND COMPUTATION, 1989, 82 (03) : 231 - 246
  • [8] Adversarial Learning from Crowds
    Chen, Pengpeng
    Sun, Hailong
    Yang, Yongqiang
    Chen, Zhijun
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 5304 - 5312
  • [9] A New Method for Learning Decision Trees from Rules
    Abdelhalim, Amany
    Traore, Issa
    EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 693 - 698
  • [10] Learning decision trees from dynamic data streams
    Gama, J
    Medas, P
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2005, 11 (08) : 1353 - 1366