Synthetic Augmentation with Large-Scale Unconditional Pre-training

Cited by: 8
Authors
Ye, Jiarong [1 ]
Ni, Haomiao [1 ]
Jin, Peng [1 ]
Huang, Sharon X. [1 ]
Xue, Yuan [2 ,3 ]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Johns Hopkins Univ, Baltimore, MD 21218 USA
[3] Ohio State Univ, Columbus, OH 43210 USA
Keywords
DOI
10.1007/978-3-031-43895-0_71
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Deep learning based medical image recognition systems often require a substantial amount of training data with expert annotations, which can be expensive and time-consuming to obtain. Recently, synthetic augmentation techniques have been proposed to mitigate this issue by generating realistic images conditioned on class labels. However, the effectiveness of these methods heavily depends on the representation capability of the trained generative model, which cannot be guaranteed without sufficient labeled training data. To further reduce the dependency on annotated data, we propose a synthetic augmentation method called HistoDiffusion, which can be pre-trained on large-scale unlabeled datasets and later applied to a small-scale labeled dataset for augmented training. In particular, we train a latent diffusion model (LDM) on diverse unlabeled datasets to learn common features and generate realistic images without conditional inputs. Then, we fine-tune the model with classifier guidance in latent space on an unseen labeled dataset so that the model can synthesize images of specific categories. Additionally, we adopt a selective mechanism that adds only synthetic samples matching their target labels with high confidence. We evaluate our proposed method by pre-training on three histopathology datasets and testing on a histopathology dataset of colorectal cancer (CRC) excluded from the pre-training datasets. With HistoDiffusion augmentation, the classification accuracy of a backbone classifier improves markedly, by 6.4%, using only a small set of the original labels. Our code is available at https://github.com/karenyyy/HistoDiffAug.
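The two mechanisms named in the abstract, classifier guidance in latent space and confidence-based sample selection, can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the callables (eps_model, latent_classifier, decoder, image_classifier) and all hyperparameters are assumptions made for exposition, not the authors' implementation; see the linked repository for the original code.

```python
# Minimal sketch: classifier-guided noise prediction in latent space, plus
# the selective, confidence-thresholded augmentation step. All module names
# and hyperparameters here are hypothetical placeholders.
import torch
import torch.nn.functional as F

def guided_noise_prediction(eps_model, latent_classifier, z_t, t, y,
                            guidance_scale=2.0):
    """One classifier-guided noise prediction for a reverse-diffusion step.

    eps_model(z_t, t)          -> predicted noise from the pre-trained LDM
    latent_classifier(z_t, t)  -> class logits for the noisy latent z_t
    """
    # Gradient of log p(y | z_t) w.r.t. the latent steers the otherwise
    # unconditional model toward class y (standard classifier guidance).
    z_in = z_t.detach().requires_grad_(True)
    log_probs = F.log_softmax(latent_classifier(z_in, t), dim=-1)
    log_prob_y = log_probs[torch.arange(len(y)), y].sum()
    grad = torch.autograd.grad(log_prob_y, z_in)[0]

    with torch.no_grad():
        eps = eps_model(z_t, t)
    # Shift the noise prediction along the classifier gradient; the exact
    # sign and scaling depend on the noise scheduler in use.
    return eps - guidance_scale * grad

def select_confident(decoder, image_classifier, z_0, y, threshold=0.9):
    """Selective mechanism: keep only synthetic samples whose predicted
    probability for the intended label y exceeds the threshold."""
    with torch.no_grad():
        images = decoder(z_0)                       # latents -> image space
        probs = F.softmax(image_classifier(images), dim=-1)
        confidence = probs[torch.arange(len(y)), y]
    keep = confidence >= threshold
    return images[keep], y[keep]
```

In a full sampler, guided_noise_prediction would replace the plain noise prediction inside each step of the reverse-diffusion loop, and select_confident would filter the decoded batch before it is added to the augmented training set.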
Pages: 754-764
Page count: 11
Related Papers
50 records in total
  • [41] Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training
    Pan, Yingwei
    Li, Yehao
    Luo, Jianjie
    Xu, Jun
    Yao, Ting
Mei, Tao
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 7070 - 7074
  • [42] Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition
    Masumura, Ryo
    Makishima, Naoki
    Ihori, Mana
    Takashima, Akihiko
    Tanaka, Tomohiro
    Orihashi, Shota
    INTERSPEECH 2020, 2020, : 2822 - 2826
  • [43] GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training
    Deng, Xinchi
    Shi, Han
    Huang, Runhui
    Li, Changlin
    Xu, Hang
    Han, Jianhua
    Kwok, James
    Zhao, Shen
    Zhang, Wei
    Liang, Xiaodan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22121 - 22132
  • [44] Foundations and Applications in Large-scale AI Models: Pre-training, Fine-tuning, and Prompt-based Learning
    Cheng, Derek
    Patel, Dhaval
    Pang, Linsey
    Mehta, Sameep
    Xie, Kexin
    Chi, Ed H.
    Liu, Wei
    Chawla, Nitesh
    Bailey, James
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 5853 - 5854
  • [45] Evaluating Synthetic Pre-Training for Handwriting Processing Tasks
    Pippi, Vittorio
    Cascianelli, Silvia
    Baraldi, Lorenzo
    Cucchiara, Rita
    PATTERN RECOGNITION LETTERS, 2023, 172 : 44 - 50
  • [46] Insights into Pre-training via Simpler Synthetic Tasks
    Wu, Yuhuai
    Li, Felix
    Liang, Percy
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [47] Synthetic Pre-Training Tasks for Neural Machine Translation
    He, Zexue
    Blackwood, Graeme
    Panda, Rameswar
    McAuley, Julian
    Feris, Rogerio
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 8080 - 8098
  • [48] SMILES-BERT: Large Scale Unsupervised Pre-Training for Molecular Property Prediction
    Wang, Sheng
    Guo, Yuzhi
    Wang, Yuhong
    Sun, Hongmao
    Huang, Junzhou
    ACM-BCB'19: PROCEEDINGS OF THE 10TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, 2019, : 429 - 436
  • [49] LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval
    Luo, Ziyang
    Zhao, Pu
    Xu, Can
    Geng, Xiubo
    Shen, Tao
    Tao, Chongyang
    Ma, Jing
    Lin, Qingwei
    Jiang, Daxin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11172 - 11183
  • [50] Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone
    Kanda, Naoyuki
    Ye, Guoli
    Wu, Yu
    Gaur, Yashesh
    Wang, Xiaofei
    Meng, Zhong
    Chen, Zhuo
    Yoshioka, Takuya
    INTERSPEECH 2021, 2021, : 3430 - 3434