Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm

Cited by: 2
Authors
Xu, Zuobing [1 ]
Hogan, Christopher [2 ]
Bauer, Robert [2 ]
Institutions
[1] eBay Inc, San Jose, CA 95125 USA
[2] H5 Inc, San Francisco, CA 94105 USA
Keywords
large scale; active learning; greedy algorithm; submodular functions;
DOI
10.1109/ICDMW.2009.38
Chinese Library Classification (CLC) number
TP [automation technology; computer technology];
Discipline classification code
0812
Abstract
Active learning algorithms actively select training examples for labeling by domain experts, which is very effective in reducing human labeling effort in supervised learning. To reduce training time, as well as to provide a more convenient user-interaction environment, it is necessary to select batches of new training examples rather than a single example at a time. Batch mode active learning algorithms incorporate a diversity measure to construct a batch of diversified candidate examples. Existing approaches use greedy algorithms, which are feasible at the scale of thousands of examples but not efficient enough to scale to larger real-world classification applications containing millions of examples. In this paper, we present an extremely efficient active learning algorithm. The new algorithm achieves the same results as the traditional greedy algorithm while reducing the run time by a factor of several hundred. We prove that the objective function of the algorithm is submodular, which guarantees that it finds the same solution as the greedy algorithm. We evaluate our approach on several large-scale real-world text classification problems and show that it achieves substantial speedups while obtaining the same classification accuracy.
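The abstract does not include pseudocode, but the guarantee it describes — matching the greedy solution exactly while skipping most gain evaluations, justified by submodularity — is the property exploited by accelerated ("lazy") greedy selection. The sketch below is an illustrative implementation of that general scheme under assumed names (`lazy_greedy`, a toy coverage objective `cover`), not the paper's actual algorithm:

```python
import heapq

def lazy_greedy(f, ground_set, k):
    """Select a size-k batch maximizing a submodular set function f.

    Submodularity means an element's marginal gain can only shrink as the
    selected set grows, so a stale gain is a valid upper bound and most
    elements never need re-evaluation.
    """
    selected = []
    base = f(selected)
    # Max-heap (via negated gains) of initial marginal gains f({e}) - f({}).
    heap = [(-(f([e]) - base), e) for e in ground_set]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        _, e = heapq.heappop(heap)
        # Recompute this element's gain with respect to the current set.
        gain = f(selected + [e]) - f(selected)
        if not heap or gain >= -heap[0][0]:
            # Its fresh gain still beats every rival's upper bound, so by
            # submodularity it is exactly the greedy choice.
            selected.append(e)
        else:
            heapq.heappush(heap, (-gain, e))
    return selected

# Toy submodular objective: set coverage (hypothetical example data).
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1}}

def cover(chosen):
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

print(lazy_greedy(cover, list(sets), 2))  # → [0, 2]
```

Elements 0 and 2 together cover six items, the same batch plain greedy would pick; the lazy variant simply avoids recomputing gains for elements whose stale upper bound already loses.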
Pages: 326+
Number of pages: 2
Related papers
50 items total
  • [41] BatchRank: A Novel Batch Mode Active Learning Framework for Hierarchical Classification
    Chakraborty, Shayok
    Balasubramanian, Vineeth
    Sankar, Adepu Ravi
    Panchanathan, Sethuraman
    Ye, Jieping
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 99 - 108
  • [42] Batch mode active learning for mitotic phenotypes using conformal prediction
    Corrigan, Adam M.
    Hopcroft, Philip
    Narvaez, Ana J.
    Bendtsen, Claus
    CONFORMAL AND PROBABILISTIC PREDICTION AND APPLICATIONS, VOL 128, 2020, 128 : 229 - 243
  • [43] Budgeted Batch Mode Active Learning with Generalized Cost and Utility Functions
    Agarwal, Arvind
    Mujumdar, Shashank
    Gupta, Nitin
    Mehta, Sameep
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7692 - 7698
  • [44] Exploring chemical and conformational spaces by batch mode deep active learning
    Zaverkin, Viktor
    Holzmueller, David
    Steinwart, Ingo
    Kaestner, Johannes
    DIGITAL DISCOVERY, 2022, 1 (05): : 605 - 620
  • [45] A cluster-assumption based batch mode active learning technique
    Patra, Swarnajyoti
    Bruzzone, Lorenzo
    PATTERN RECOGNITION LETTERS, 2012, 33 (09) : 1042 - 1048
  • [46] Batch Mode Active Learning Based on Multi-Set Clustering
    Yang, Yazhou
    Yin, Xiaoqing
    Zhao, Yang
    Lei, Jun
    Li, Weili
    Shu, Zhe
    IEEE ACCESS, 2021, 9 : 51452 - 51463
  • [47] Batch Mode Active Learning for Networked Data with Optimal Subset Selection
    Xu, Haihui
    Zhao, Pengpeng
    Sheng, Victor S.
    Liu, Guanfeng
    Zhao, Lei
    Wu, Jian
    Cui, Zhiming
    WEB-AGE INFORMATION MANAGEMENT (WAIM 2015), 2015, 9098 : 96 - 108
  • [48] A Multicriterion Query-Based Batch Mode Active Learning Technique
    Jiao, Yang
    Zhao, Pengpeng
    Wu, Jian
    Shi, Yujie
    Cui, Zhiming
    FOUNDATIONS OF INTELLIGENT SYSTEMS (ISKE 2013), 2014, 277 : 669 - 680
  • [49] Batch-Mode Active Learning via Error Bound Minimization
    Gu, Quanquan
    Zhang, Tong
    Han, Jiawei
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2014, : 300 - 309