Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm

Cited by: 2
Authors
Xu, Zuobing [1 ]
Hogan, Christopher [2 ]
Bauer, Robert [2 ]
Institutions
[1] eBay Inc, San Jose, CA 95125 USA
[2] H5 Inc, San Francisco, CA 94105 USA
Keywords
large scale; active learning; greedy algorithm; submodular functions;
DOI
10.1109/ICDMW.2009.38
Chinese Library Classification (CLC) number
TP [automation technology; computer technology];
Discipline classification code
0812
Abstract
Active learning algorithms actively select training examples for labeling by domain experts, which is very effective in reducing human labeling effort in supervised learning. To reduce training time, as well as to provide a more convenient user-interaction environment, it is necessary to select batches of new training examples rather than a single example at a time. Batch mode active learning algorithms incorporate a diversity measure to construct a batch of diversified candidate examples. Existing approaches use greedy algorithms, which are feasible at the scale of thousands of examples but not efficient enough to scale to larger real-world classification applications containing millions of examples. In this paper, we present an extremely efficient active learning algorithm. The new algorithm achieves the same results as the traditional greedy algorithm while reducing the run time by a factor of several hundred. We prove that the objective function of the algorithm is submodular, which guarantees that it finds the same solution as the greedy algorithm. We evaluate our approach on several large-scale real-world text classification problems and show that it achieves substantial speedups while obtaining the same classification accuracy.
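The abstract does not include pseudocode, but the guarantee it describes — matching the greedy solution exactly while skipping most gain evaluations, justified by submodularity — is the property exploited by accelerated ("lazy") greedy selection. The sketch below is an illustrative implementation of that general scheme under assumed names (`lazy_greedy`, a toy coverage objective `cover`), not the paper's actual algorithm:

```python
import heapq

def lazy_greedy(f, ground_set, k):
    """Select a size-k batch maximizing a submodular set function f.

    Submodularity means an element's marginal gain can only shrink as the
    selected set grows, so a stale gain is a valid upper bound and most
    elements never need re-evaluation.
    """
    selected = []
    base = f(selected)
    # Max-heap (via negated gains) of initial marginal gains f({e}) - f({}).
    heap = [(-(f([e]) - base), e) for e in ground_set]
    heapq.heapify(heap)
    while heap and len(selected) < k:
        _, e = heapq.heappop(heap)
        # Recompute this element's gain with respect to the current set.
        gain = f(selected + [e]) - f(selected)
        if not heap or gain >= -heap[0][0]:
            # Its fresh gain still beats every rival's upper bound, so by
            # submodularity it is exactly the greedy choice.
            selected.append(e)
        else:
            heapq.heappush(heap, (-gain, e))
    return selected

# Toy submodular objective: set coverage (hypothetical example data).
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1}}

def cover(chosen):
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

print(lazy_greedy(cover, list(sets), 2))  # → [0, 2]
```

Elements 0 and 2 together cover six items, the same batch plain greedy would pick; the lazy variant simply avoids recomputing gains for elements whose stale upper bound already loses.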
Pages: 326+
Number of pages: 2
Related papers
50 items total
  • [41] BatchRank: A Novel Batch Mode Active Learning Framework for Hierarchical Classification
    Chakraborty, Shayok
    Balasubramanian, Vineeth
    Sankar, Adepu Ravi
    Panchanathan, Sethuraman
    Ye, Jieping
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, : 99 - 108
  • [42] Batch mode active learning for mitotic phenotypes using conformal prediction
    Corrigan, Adam M.
    Hopcroft, Philip
    Narvaez, Ana J.
    Bendtsen, Claus
    CONFORMAL AND PROBABILISTIC PREDICTION AND APPLICATIONS, VOL 128, 2020, 128 : 229 - 243
  • [43] Budgeted Batch Mode Active Learning with Generalized Cost and Utility Functions
    Agarwal, Arvind
    Mujumdar, Shashank
    Gupta, Nitin
    Mehta, Sameep
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 7692 - 7698
  • [44] Exploring chemical and conformational spaces by batch mode deep active learning
    Zaverkin, Viktor
    Holzmueller, David
    Steinwart, Ingo
    Kaestner, Johannes
    DIGITAL DISCOVERY, 2022, 1 (05): : 605 - 620
  • [45] A cluster-assumption based batch mode active learning technique
    Patra, Swarnajyoti
    Bruzzone, Lorenzo
    PATTERN RECOGNITION LETTERS, 2012, 33 (09) : 1042 - 1048
  • [46] Batch Mode Active Learning Based on Multi-Set Clustering
    Yang, Yazhou
    Yin, Xiaoqing
    Zhao, Yang
    Lei, Jun
    Li, Weili
    Shu, Zhe
    IEEE ACCESS, 2021, 9 : 51452 - 51463
  • [47] Batch Mode Active Learning for Networked Data with Optimal Subset Selection
    Xu, Haihui
    Zhao, Pengpeng
    Sheng, Victor S.
    Liu, Guanfeng
    Zhao, Lei
    Wu, Jian
    Cui, Zhiming
    WEB-AGE INFORMATION MANAGEMENT (WAIM 2015), 2015, 9098 : 96 - 108
  • [48] A Multicriterion Query-Based Batch Mode Active Learning Technique
    Jiao, Yang
    Zhao, Pengpeng
    Wu, Jian
    Shi, Yujie
    Cui, Zhiming
    FOUNDATIONS OF INTELLIGENT SYSTEMS (ISKE 2013), 2014, 277 : 669 - 680
  • [49] Batch-Mode Active Learning via Error Bound Minimization
    Gu, Quanquan
    Zhang, Tong
    Han, Jiawei
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2014, : 300 - 309