Learning from crowds with sparse and imbalanced annotations

Authors
Ye Shi
Shao-Yuan Li
Sheng-Jun Huang
Affiliation
[1] Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology
Source
Machine Learning | 2023 / Vol. 112, No. 6
Keywords
Crowdsourcing; Sparse annotations; Class-imbalance; Self-training
Abstract
Traditional supervised learning requires ground-truth labels for training, which are difficult to collect in many cases. Recently, crowdsourcing has established itself as an efficient labeling solution that relies on non-expert crowds. To reduce the effect of labeling errors, a common practice is to distribute each instance to multiple workers, while each worker annotates only a subset of the data, which results in sparse annotations. In this paper, we show that under class imbalance, even when the ground-truth labels are only slightly imbalanced, the sparse annotations are prone to a skewed distribution that severely biases the learning algorithm. To combat this issue, we propose a Distribution Aware Self-training based Crowdsourcing learning (DASC) approach, which supplements the sparse annotations with confident pseudo-annotations while re-balancing the annotation distribution. Specifically, we propose a distribution-aware confidence measure to select the most confident pseudo-annotations, with minority classes selected more frequently and majority classes less frequently. As a universal framework, DASC is applicable to various crowdsourcing methods and yields consistent performance gains. We conduct extensive experiments on real-world crowdsourcing benchmarks, from slight to heavy imbalance ratios and at various annotation sparsity levels, and show that DASC substantially improves previous crowdsourcing models by 2%-20% absolute test accuracy and yields much more balanced annotations.
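The abstract does not state the exact form of the distribution-aware confidence measure, so the following is only a minimal sketch of the general idea: pick confident pseudo-annotations while giving rarer classes a larger share of the selection budget. The function name `select_pseudo_annotations`, the inverse-frequency quota rule, and the `budget` parameter are all assumptions made for illustration and are not taken from the paper.

```python
def select_pseudo_annotations(probs, annotation_counts, budget):
    """Pick up to `budget` confident pseudo-annotations, favoring rare classes.

    probs: per-candidate class-probability lists (hypothetical model outputs).
    annotation_counts: current number of annotations observed for each class.
    Returns a list of (candidate_index, predicted_class) pairs.
    """
    num_classes = len(annotation_counts)

    # Assumed re-balancing rule (not the paper's): each class receives a quota
    # inversely proportional to how many annotations it already has.
    inverse = [1.0 / max(count, 1) for count in annotation_counts]
    total = sum(inverse)
    quotas = [round(budget * weight / total) for weight in inverse]

    # Group candidates by predicted class, keeping the prediction confidence.
    by_class = {k: [] for k in range(num_classes)}
    for index, p in enumerate(probs):
        predicted = max(range(num_classes), key=lambda c: p[c])
        by_class[predicted].append((p[predicted], index))

    # Within each class, keep the most confident candidates up to the quota.
    selected = []
    for k in range(num_classes):
        ranked = sorted(by_class[k], reverse=True)
        selected.extend((index, k) for _, index in ranked[:quotas[k]])
    return selected


# Toy usage: class 0 is the majority (90 annotations), class 1 the minority (10).
candidate_probs = [
    [0.90, 0.10], [0.80, 0.20], [0.95, 0.05],
    [0.30, 0.70], [0.40, 0.60], [0.20, 0.80],
]
picked = select_pseudo_annotations(candidate_probs, [90, 10], budget=10)
```

In this toy setting the minority class receives most of the quota, so its candidates are accepted at a lower confidence than the majority class would require. That mirrors the "minority more, majority less" behavior the abstract describes, without claiming to reproduce DASC itself.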
Pages: 1823-1845
Page count: 22