Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

Cited by: 29
Authors
Kaushal, Vishal [1]
Iyer, Rishabh [2]
Kothawade, Suraj [1]
Mahadev, Rohan [3]
Doctor, Khoshrav [4]
Ramakrishnan, Ganesh [1]
Affiliations
[1] Indian Inst Technol, Mumbai, Maharashtra, India
[2] Microsoft, Redmond, WA USA
[3] AITOE Labs, Mumbai, Maharashtra, India
[4] Univ Massachusetts, Amherst, MA 01003 USA
Keywords
DOI
10.1109/WACV.2019.00142
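Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract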
State-of-the-art computer vision techniques based on supervised machine learning are in general data hungry. Their data curation poses the challenges of expensive human labeling, inadequate computing resources and longer experiment turnaround times. Training data subset selection and active learning techniques have been proposed as possible solutions to these challenges. A special class of subset selection functions naturally models notions of diversity, coverage and representation and can be used to eliminate redundancy, thus lending itself well to training data subset selection. These functions can also improve the efficiency of active learning, further reducing human labeling effort, by selecting a subset of the examples obtained using conventional uncertainty-sampling-based techniques. In this work, we empirically demonstrate the effectiveness of two diversity models, namely the Facility-Location and Dispersion models, for training data subset selection and for reducing labeling effort. We demonstrate this across a variety of computer vision tasks including Gender Recognition, Face Recognition, Scene Recognition, Object Detection and Object Recognition. Our results show that diversity-based subset selection done in the right way can increase accuracy by up to 5-10% over existing baselines, particularly in settings in which less training data is available. This allows the training of complex machine learning models like Convolutional Neural Networks with much less training data and lower labeling costs while incurring minimal performance loss.
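The two diversity models named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes the textbook forms of the two objectives (facility location as the sum over all points of the similarity to the closest selected point, dispersion as the minimum pairwise distance within the selected set), a hypothetical similarity of 1 / (1 + Euclidean distance), plain greedy selection, and a made-up toy data set. The function names are illustrative only.

```python
from math import dist


def facility_location_greedy(points, k):
    """Greedily pick k indices to maximize sum_i max_{j in S} sim(i, j)."""
    n = len(points)
    # Assumed similarity (not taken from the paper): 1 / (1 + Euclidean distance).
    sim = [[1.0 / (1.0 + dist(p, q)) for q in points] for p in points]
    selected = []
    best = [0.0] * n  # best[i]: similarity of point i to its closest selected point
    for _ in range(min(k, n)):
        # Marginal gain of each unselected candidate j.
        gains = {
            j: sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            for j in range(n)
            if j not in selected
        }
        j_star = max(gains, key=gains.get)
        selected.append(j_star)
        best = [max(best[i], sim[i][j_star]) for i in range(n)]
    return selected


def dispersion_greedy(points, k, start=0):
    """Farthest-point heuristic: each pick maximizes its minimum distance
    to the points already selected."""
    n = len(points)
    selected = [start]
    min_d = [dist(p, points[start]) for p in points]
    while len(selected) < min(k, n):
        candidates = [j for j in range(n) if j not in selected]
        j_star = max(candidates, key=lambda j: min_d[j])
        selected.append(j_star)
        min_d = [min(min_d[i], dist(points[i], points[j_star])) for i in range(n)]
    return selected


# Toy data: two tight clusters plus one isolated point (hypothetical example).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
cluster = [0, 0, 0, 1, 1, 2]

fl = facility_location_greedy(points, 3)
dp = dispersion_greedy(points, 3)
print("facility location:", fl)
print("dispersion:", dp)
```

On this toy data both selectors return one point from each of the three groups, which is the redundancy-eliminating behavior the abstract describes. Facility location favors representative points, while dispersion favors points that are far apart from one another.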
Pages: 1289-1299
Number of pages: 11