Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm

Cited by: 2
Authors
Xu, Zuobing [1 ]
Hogan, Christopher [2 ]
Bauer, Robert [2 ]
Affiliations
[1] eBay Inc, San Jose, CA 95125 USA
[2] H5 Inc, San Francisco, CA 94105 USA
Keywords
large scale; active learning; greedy algorithm; submodular functions;
DOI
10.1109/ICDMW.2009.38
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Active learning algorithms actively select training examples for which to acquire labels from domain experts, which is very effective in reducing human labeling effort in supervised learning. To reduce training time, as well as to provide a more convenient user interaction environment, it is necessary to select batches of new training examples rather than a single example at a time. Batch mode active learning algorithms incorporate a diversity measure to construct a batch of diversified candidate examples. Existing approaches use greedy algorithms, which are feasible at the scale of thousands of examples. Greedy algorithms, however, are not efficient enough to scale to larger real-world classification applications containing millions of examples. In this paper, we present an extremely efficient active learning algorithm. This new algorithm achieves the same results as the traditional greedy algorithm, while reducing the run time by a factor of several hundred. We prove that the objective function of the algorithm is submodular, which guarantees that it finds the same solution as the greedy algorithm. We evaluate our approach on several large-scale real-world text classification problems and show that our new approach achieves substantial speedups while obtaining the same classification accuracy.
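The speedup claimed above rests on submodularity: under diminishing returns, a marginal gain computed against an earlier, smaller selection is an upper bound on the current gain, so a priority queue of stale gains lets most re-evaluations be skipped while still returning exactly the batch that naive greedy would pick. The sketch below illustrates this "lazy" greedy idea; the `marginal_gain` callback and the overall structure are illustrative assumptions, not the paper's exact formulation.

```python
import heapq

def lazy_greedy_batch(candidates, marginal_gain, batch_size):
    """Greedy batch selection for a submodular objective, with lazy evaluation.

    Because of diminishing returns, a gain computed against an older
    selection upper-bounds the current gain, so a popped entry whose bound
    is already up to date can be accepted without rescoring every candidate.
    The returned batch is identical to the one naive greedy would select.
    """
    selected = []
    # Heap entries: (negated gain bound, candidate index, selection size at
    # which the gain was last computed). heapq is a min-heap, hence negation.
    heap = [(-marginal_gain(x, selected), i, 0) for i, x in enumerate(candidates)]
    heapq.heapify(heap)

    while heap and len(selected) < batch_size:
        neg_gain, i, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # Gain is current with respect to the selected set: accept it.
            selected.append(candidates[i])
        else:
            # Stale bound: recompute against the current selection and push
            # back. Submodularity guarantees the new gain is no larger.
            heapq.heappush(
                heap,
                (-marginal_gain(candidates[i], selected), i, len(selected)),
            )
    return selected
```

As a usage illustration, with a facility-location style objective f(S) = Σ_x max_{s∈S} sim(x, s), `marginal_gain` would return the increase in that sum from adding one candidate. In practice only a few heap entries are recomputed per selection, which is how accelerated greedy variants obtain large speedups without changing the selected batch.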
Pages: 326 / +
Number of pages: 2
Related Papers (50 total)
  • [31] Semisupervised SVM Batch Mode Active Learning with Applications to Image Retrieval
    Hoi, Steven C. H.
    Jin, Rong
    Zhu, Jianke
    Lyu, Michael R.
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2009, 27 (03)
  • [32] Batch-Mode Active Learning for Technology-Assisted Review
    Saha, Tanay Kumar
    Al Hasan, Mohammad
    Burgess, Chandler
    Habib, Md Ahsan
    Johnson, Jeff
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 1134 - 1143
  • [33] Asymmetric propagation based batch mode active learning for image retrieval
    Niu, Biao
    Cheng, Jian
    Bai, Xiao
    Lu, Hanqing
    SIGNAL PROCESSING, 2013, 93 (06) : 1639 - 1650
  • [34] Cluster optimized batch mode active learning sample selection method
    He, Zhonghai
    Xia, Zhichao
    Du, Yinzhi
    Zhang, Xiaofang
    INFRARED PHYSICS & TECHNOLOGY, 2025, 145
  • [35] Batch Mode Active Learning for Node Classification in Assortative and Disassortative Networks
    Ping, Shuqiu
    Liu, Dayou
    Yang, Bo
    Zhu, Yungang
    Chen, Hechang
    Wang, Zheng
    IEEE ACCESS, 2018, 6 : 4750 - 4758
  • [36] Correction to: Batch mode active learning via adaptive criteria weights
    Hao Li
    Yongli Wang
    Yanchao Li
    Gang Xiao
    Peng Hu
    Ruxin Zhao
    Applied Intelligence, 2021, 51 (6) : 3490 - 3490
  • [37] A novel batch-mode active learning method for SVM classifier
    Liu, Kang
    Qian, Xu
Journal of Information and Computational Science, 2012, 9 (16) : 5077 - 5084
  • [38] NimbleLearn: A Scalable and Fast Batch-mode Active Learning Approach
    Kong, Ruoyan
    Qiu, Zhanlong
    Liu, Yang
    Zhao, Qi
    21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 350 - 359
  • [40] Batch Mode Active Learning with Applications to Text Categorization and Image Retrieval
    Hoi, Steven C. H.
    Jin, Rong
    Lyu, Michael R.
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (09) : 1233 - 1248