Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm

Cited by: 2
Authors
Xu, Zuobing [1]
Hogan, Christopher [2]
Bauer, Robert [2]
Affiliations
[1] eBay Inc, San Jose, CA 95125 USA
[2] H5 Inc, San Francisco, CA 94105 USA
Keywords
large scale; active learning; greedy algorithm; submodular functions;
DOI
10.1109/ICDMW.2009.38
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Active learning algorithms actively select training examples for which to acquire labels from domain experts, and are very effective at reducing human labeling effort in supervised learning. To reduce training time and to provide a more convenient user interaction environment, it is necessary to select batches of new training examples rather than a single example at a time. Batch mode active learning algorithms incorporate a diversity measure to construct a batch of diversified candidate examples. Existing approaches use greedy algorithms, which make batch selection feasible at the scale of thousands of examples. Greedy algorithms, however, are not efficient enough to scale to larger real-world classification applications, which contain millions of examples. In this paper, we present an extremely efficient active learning algorithm. The new algorithm achieves the same results as the traditional greedy algorithm while reducing the run time by a factor of several hundred. We prove that the objective function of the algorithm is submodular, which guarantees that it finds the same solution as the greedy algorithm. We evaluate our approach on several large-scale real-world text classification problems and show that it achieves substantial speedups while obtaining the same classification accuracy.
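The abstract does not spell out the algorithm itself, so the following is only an illustrative sketch of how submodularity can let a faster procedure return exactly the greedy selection. It uses the well-known lazy (accelerated) greedy technique, which may or may not be what the authors do. The facility-location objective, the random similarity matrix, and the names `marginal_gain`, `plain_greedy`, and `lazy_greedy` are all assumptions made for this example.

```python
import heapq
import random


def marginal_gain(sim, best, j):
    # Gain of adding candidate j to a facility-location objective
    # f(S) = sum_i max_{s in S} sim[i][s] (an assumed, illustrative objective).
    return sum(max(0.0, row[j] - b) for row, b in zip(sim, best))


def plain_greedy(sim, k):
    # Standard greedy: re-evaluate every remaining candidate at every step.
    n = len(sim)
    best, chosen, evaluations = [0.0] * n, [], 0
    for _ in range(k):
        top_gain, top_j = -1.0, None
        for j in range(n):
            if j in chosen:
                continue
            gain = marginal_gain(sim, best, j)
            evaluations += 1
            if gain > top_gain:
                top_gain, top_j = gain, j
        chosen.append(top_j)
        best = [max(b, row[top_j]) for row, b in zip(sim, best)]
    return chosen, evaluations


def lazy_greedy(sim, k):
    # Lazy greedy: by submodularity, a gain computed earlier is an upper
    # bound on the current gain, so stale heap entries are refreshed only
    # when they reach the top of the heap.
    n = len(sim)
    best, chosen = [0.0] * n, []
    heap = [(-marginal_gain(sim, best, j), j, 0) for j in range(n)]
    heapq.heapify(heap)
    evaluations = n
    for step in range(k):
        while True:
            neg_gain, j, stamp = heapq.heappop(heap)
            if stamp == step:  # gain is up to date for the current batch
                break
            evaluations += 1
            heapq.heappush(heap, (-marginal_gain(sim, best, j), j, step))
        chosen.append(j)
        best = [max(b, row[j]) for row, b in zip(sim, best)]
    return chosen, evaluations


# Hypothetical data: a random non-negative similarity matrix.
rng = random.Random(0)
sim = [[rng.random() for _ in range(200)] for _ in range(200)]
batch_plain, evals_plain = plain_greedy(sim, 10)
batch_lazy, evals_lazy = lazy_greedy(sim, 10)
print(batch_plain == batch_lazy, evals_plain, evals_lazy)
```

In this sketch both routines select the same batch in the same order, and the lazy variant never performs more gain evaluations than the plain one; how large the saving is depends on the data and the objective.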
Pages: 326 / +
Page count: 2
Related papers
50 in total
  • [1] EFFICIENT BATCH-MODE ACTIVE LEARNING OF RANDOM FOREST
    Nguyen, Hieu T.
    Yadegar, Joseph
    Kong, Bailey
    Wei, Hai
    2012 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2012, : 596 - 599
  • [2] Efficient Transport Simulation With Restricted Batch-Mode Active Learning
    Antunes, Francisco
    Ribeiro, Bernardete
    Pereira, Francisco C.
    Gomes, Rui
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (11) : 3642 - 3651
  • [3] Adaptive Batch Mode Active Learning
    Chakraborty, Shayok
    Balasubramanian, Vineeth
    Panchanathan, Sethuraman
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (08) : 1747 - 1760
  • [4] Dynamic Batch Mode Active Learning
    Chakraborty, Shayok
    Balasubramanian, Vineeth
    Panchanathan, Sethuraman
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011,
  • [5] Greedy active learning algorithm for logistic regression models
    Hsu, Hsiang-Ling
    Chang, Yuan-chin Ivan
    Chen, Ray-Bing
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2019, 129 : 119 - 134
  • [6] Batch Mode Active Learning for Networked Data
    Shi, Lixin
    Zhao, Yuhang
    Tang, Jie
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2012, 3 (02)
  • [7] Batch Mode Active Learning for Biometric Recognition
    Chakraborty, Shayok
    Balasubramanian, Vineeth
    Panchanathan, Sethuraman
    BIOMETRIC TECHNOLOGY FOR HUMAN IDENTIFICATION VII, 2010, 7667
  • [8] Ranked batch-mode active learning
    Cardoso, Thiago N. C.
    Silva, Rodrigo M.
    Canuto, Sergio
    Moro, Mirella M.
    Goncalves, Marcos A.
    INFORMATION SCIENCES, 2017, 379 : 313 - 337
  • [9] Batch Mode Active Learning for Interactive Image Retrieval
    Ngo Truong Giang
    Ngo Quoc Tao
    Nguyen Duc Dung
    Nguyen Trong The
    2014 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM), 2014, : 28 - 31
  • [10] Batch Mode Active Learning for Geographical Image Classification
    Wang, Zengmao
    Du, Bo
    Zhang, Lefei
    Hu, Wenbin
    Tao, Dacheng
    Zhang, Liangpei
    WEB TECHNOLOGIES AND APPLICATIONS (APWEB 2015), 2015, 9313 : 744 - 755