Adaptive batch mode active learning with deep similarity

Cited by: 1
Authors
Zhang, Kaiyuan [1 ]
Qian, Buyue [2 ]
Wei, Jishang [3 ]
Yin, Changchang [1 ]
Cao, Shilei [1 ]
Li, Xiaoyu [1 ]
Cao, Yanjun [4 ]
Zheng, Qinghua [1 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Shaanxi, Peoples R China
[2] Capital Med Univ, Beijing Chaoyang Hosp, Beijing 100020, Peoples R China
[3] HP Labs, 1501 Page Mill Rd, Palo Alto, CA 94304 USA
[4] Northwest Univ, Biomed Key Lab Shaanxi Prov, Xian 710069, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Active learning; Adaptive batch mode active learning; Classification model; Deep neural network; Deep learning; CLASSIFICATION;
DOI
10.1016/j.eij.2023.100412
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Active learning is typically used in scenarios where few labels are available and manual labeling is expensive. To improve model performance, the most valuable instances among all instances must be identified and labeled so as to maximize the benefit of labeling. In practical scenarios, it is often more efficient to query a group of instances rather than an individual instance in each iteration. To achieve this, we need to explore the similarities between instances to ensure both informativeness and diversity. Many ad-hoc algorithms have been proposed for batch mode active learning, and they generally suffer from two major issues. First, similarity measurement among instances often relies only on the feature representation and is not well integrated with the classification model, which undermines the precise measurement of diversity. Second, to explore the decision boundary, these algorithms often select instances near the boundary, yet the true boundary is difficult to estimate when few labeled instances are available. As more instances are labeled, the information between instances remains underexploited, and performance can be greatly improved if it is properly used. In this work, we propose an adaptive algorithm based on deep neural networks to address these two problems. During the training phase, we establish a paired network that improves the accuracy of the classification model and projects instances into a new feature space for more accurate similarity measurement. When labeling instances in batches, we use an adaptive algorithm that selects instances by balancing maximum uncertainty (exploration) and diversity (exploitation). Our algorithm has been validated on heart failure prediction tasks over real-world EHR datasets. Since the EHR data cannot be made public, we also conducted validation on two other classic classification tasks. Our algorithm outperforms the baseline methods in both accuracy and convergence rate.
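To make the selection idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation: a greedy batch selector that scores each unlabeled instance by a weighted sum of predictive entropy (uncertainty) and its distance to already-selected instances in a learned embedding space (diversity). The function name select_batch, the fixed trade_off weight, and the entropy/minimum-distance scoring are illustrative assumptions; the paper adapts the balance as labeled data accumulate and learns the embedding with a paired network.

import numpy as np

def select_batch(probs, embeddings, batch_size, trade_off=0.5):
    """Greedy batch selection balancing uncertainty and diversity.

    probs      : (n, n_classes) predicted class probabilities for the unlabeled pool
    embeddings : (n, d) instance representations (e.g., from a similarity network)
    trade_off  : weight between uncertainty (exploration) and diversity (exploitation)
    """
    # Uncertainty: predictive entropy, normalized to [0, 1].
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    entropy /= np.log(probs.shape[1])

    selected = []
    for _ in range(batch_size):
        if not selected:
            # No reference points yet: every instance is equally "diverse".
            diversity = np.ones(len(probs))
        else:
            # Diversity: distance to the nearest already-selected instance
            # in the embedding space, normalized to [0, 1].
            chosen = embeddings[selected]                                   # (k, d)
            dists = np.linalg.norm(
                embeddings[:, None, :] - chosen[None, :, :], axis=2
            ).min(axis=1)
            diversity = dists / (dists.max() + 1e-12)

        score = trade_off * entropy + (1.0 - trade_off) * diversity
        score[selected] = -np.inf                                           # never re-select
        selected.append(int(np.argmax(score)))
    return selected

# Toy usage: 200 pooled instances, 3 classes, 16-dim embeddings, batch of 5.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
embeddings = rng.normal(size=(200, 16))
print(select_batch(probs, embeddings, batch_size=5))

In this sketch a larger trade_off favors uncertain instances near the (estimated) decision boundary, while a smaller one spreads the batch out in the embedding space; the adaptive scheme described in the abstract would tune this balance over iterations rather than keep it fixed.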
Pages: 11
Related Papers
50 records in total
  • [21] Querying Discriminative and Representative Samples for Batch Mode Active Learning
    Wang, Zheng
    Ye, Jieping
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2015, 9 (03) : 17
  • [22] Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm
    Xu, Zuobing
    Hogan, Christopher
    Bauer, Robert
    2009 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2009), 2009, : 326 - +
  • [23] Context Aware Image Annotation in Active Learning with Batch Mode
    Sun, Yingcheng
    Loparo, Kenneth
    2019 IEEE 43RD ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 1, 2019, : 952 - 953
  • [24] Batch Mode Active Learning for Individual Treatment Effect Estimation
    Puha, Zoltan
    Kaptein, Maurits
    Lemmens, Aurelie
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020), 2020, : 859 - 866
  • [25] Querying Discriminative and Representative Samples for Batch Mode Active Learning
    Wang, Zheng
    Ye, Jieping
    19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 158 - 166
  • [26] Batch Mode Active Learning for Regression With Expected Model Change
    Cai, Wenbin
    Zhang, Muhan
    Zhang, Ya
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (07) : 1668 - 1681
  • [27] Adaptive Batch Sizes for Active Learning: A Probabilistic Numerics Approach
    Adachi, Masaki
    Hayakawa, Satoshi
    Jorgensen, Martin
    Wan, Xingchen
    Vu Nguyen
    Oberhauser, Harald
    Osborne, Michael A.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [28] Batch Active Learning for Multispectral and Hyperspectral Image Segmentation Using Similarity Graphs
    Chen, Bohan
    Miller, Kevin
    Bertozzi, Andrea L.
    Schwenk, Jon
    COMMUNICATIONS ON APPLIED MATHEMATICS AND COMPUTATION, 2024, 6 (02) : 1013 - 1033
  • [29] Batch Mode Adaptive Multiple Instance Learning for Computer Vision Tasks
    Li, Wen
    Duan, Lixin
    Tsang, Ivor Wai-Hung
    Xu, Dong
    2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2368 - 2375
  • [30] Semisupervised SVM Batch Mode Active Learning with Applications to Image Retrieval
    Hoi, Steven C. H.
    Jin, Rong
    Zhu, Jianke
    Lyu, Michael R.
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2009, 27 (03)