Scaling Backwards: Minimal Synthetic Pre-Training?

Cited by: 0
Authors
Nakamura, Ryo [1]
Tadokoro, Ryu [2]
Yamada, Ryosuke [1]
Asano, Yuki M. [3]
Laina, Iro [4]
Rupprecht, Christian [4]
Inoue, Nakamasa [5]
Yokota, Rio [5]
Kataoka, Hirokatsu [1]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
[2] Tohoku Univ, Sendai, Miyagi, Japan
[3] Univ Amsterdam, Amsterdam, Netherlands
[4] Univ Oxford, Oxford, England
[5] Tokyo Inst Technol, Meguro, Japan
Keywords: Synthetic pre-training; Limited data; Vision transformers
DOI: 10.1007/978-3-031-72633-0_9
CLC classification: TP18 (Artificial intelligence theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Pre-training and transfer learning are important building blocks of current computer vision systems. While pre-training is usually performed on large real-world image datasets, in this paper we ask whether this is truly necessary. To this end, we search for a minimal, purely synthetic pre-training dataset that allows us to achieve performance similar to the 1 million images of ImageNet-1k. We construct such a dataset from a single fractal with perturbations. With this, we contribute three main findings. (i) We show that pre-training is effective even with minimal synthetic images, with performance on par with large-scale pre-training datasets such as ImageNet-1k for full fine-tuning. (ii) We investigate the single parameter with which we construct artificial categories for our dataset. We find that while the shape differences can be indistinguishable to humans, they are crucial for obtaining strong performance. (iii) We investigate the minimal requirements for successful pre-training. Surprisingly, we find that a substantial reduction of synthetic images from 1k to 1 can even lead to an increase in pre-training performance, a motivation to further investigate "scaling backwards". Finally, we extend our method from synthetic images to real images to see whether a single real image can show a similar pre-training effect through shape augmentation. We find that the use of grayscale images and affine transformations allows even real images to "scale backwards". The code is available at https://github.com/SUPERTADORY/1p-frac.
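To make the construction described in the abstract more concrete, the sketch below builds a handful of artificial "categories" from a single iterated-function-system (IFS) fractal by slightly perturbing its affine parameters and rendering each variant with the chaos game. This is a minimal illustrative sketch only: the base IFS parameters, the perturbation scale, and all names here are assumptions for illustration, not the exact 1p-frac construction, which is specified in the paper and the linked repository.

```python
import numpy as np

def render_ifs(params, n_points=100_000, size=256, seed=0):
    """Render a binary image of an IFS attractor with the chaos game.

    params: array of shape (k, 6); each row (a, b, c, d, e, f) defines
    the affine map (x, y) -> (a*x + b*y + e, c*x + d*y + f).
    """
    rng = np.random.default_rng(seed)
    img = np.zeros((size, size), dtype=np.uint8)
    x, y = 0.0, 0.0
    for _ in range(n_points):
        a, b, c, d, e, f = params[rng.integers(len(params))]
        x, y = a * x + b * y + e, c * x + d * y + f
        px = int((x + 1.0) * 0.5 * (size - 1))
        py = int((y + 1.0) * 0.5 * (size - 1))
        if 0 <= px < size and 0 <= py < size:
            img[py, px] = 255
    return img

# Illustrative base IFS (a Sierpinski-like triangle); NOT the parameters used in the paper.
base = np.array([
    [0.5, 0.0, 0.0, 0.5, -0.5, -0.5],
    [0.5, 0.0, 0.0, 0.5,  0.5, -0.5],
    [0.5, 0.0, 0.0, 0.5,  0.0,  0.5],
])

# Artificial categories are formed by slightly perturbing the shared base shape.
# The perturbation scale (0.02 here, an assumed value) plays the role of the single
# parameter discussed above: tiny values already yield shapes humans can barely tell apart.
rng = np.random.default_rng(42)
num_categories, views_per_category = 10, 4
dataset = []
for label in range(num_categories):
    perturbed = base + rng.normal(scale=0.02, size=base.shape)
    views = [render_ifs(perturbed, seed=s) for s in range(views_per_category)]
    dataset.append((label, views))
```

The real-image extension mentioned at the end of the abstract would follow the same labelling-by-perturbation idea, but with views built from grayscale crops of one photograph under random affine transformations instead of rendered fractals.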
Pages: 153-171 (19 pages)