On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition

Cited by: 16
Authors:
Bai, Ching-Yuan [1 ]
Lin, Hsuan-Tien [1 ]
Raffel, Colin [2 ]
Kan, Wendy Chih-wen [2 ]
Affiliations:
[1] National Taiwan University, Dept. of Computer Science & Information Engineering, Taipei, Taiwan
[2] Google, Mountain View, CA 94043 USA
Keywords:
benchmark; competition; neural networks; generative models; memorization; datasets; computer vision
DOI:
10.1145/3447548.3467198
CLC number:
TP18 [Artificial Intelligence Theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Many recent developments in generative modeling of natural images rely on heuristically motivated metrics that can be gamed easily, either by memorizing a small sample from the true distribution or by training a model directly to improve the metric. In this work, we critically evaluate the gameability of such metrics by designing and deploying a generative modeling competition that received over 11,000 submitted models. The competitive pressure among participants allowed us to investigate both intentional and unintentional memorization in generative modeling. To detect intentional memorization, we propose the Memorization-Informed Fréchet Inception Distance (MiFID), a new memorization-aware metric, and design benchmark procedures to ensure that winning submissions made genuine improvements in perceptual quality. Furthermore, we manually inspect the code of the 1,000 top-performing models to understand and label different forms of memorization. Our analysis reveals that unintentional memorization is a serious and common issue in popular generative models. We release the generated images, our memorization labels for those models, and code to compute MiFID to facilitate future studies on benchmarking generative models.
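The abstract names MiFID but does not spell out its formula here. The sketch below illustrates one plausible form of a memorization-penalized FID in the spirit of MiFID: score generated samples by their nearest-neighbor cosine distance to training samples in a shared feature space, and inflate the FID when that distance falls below a threshold. The function names, the threshold tau=0.15, and the epsilon term are illustrative assumptions, not the released implementation; feature extraction (e.g., with an Inception network) is assumed to happen upstream.

```python
import numpy as np

def memorization_distance(gen_feats, train_feats):
    """Average, over generated samples, of the minimum cosine distance
    to any training sample, computed in a shared feature space."""
    # Normalize rows so dot products become cosine similarities.
    g = gen_feats / np.linalg.norm(gen_feats, axis=1, keepdims=True)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    cos_sim = np.abs(g @ t.T)               # shape (n_gen, n_train)
    min_dist = (1.0 - cos_sim).min(axis=1)  # nearest training sample per generated sample
    return float(min_dist.mean())

def mifid_sketch(fid, gen_feats, train_feats, tau=0.15, eps=1e-6):
    """Hypothetical memorization-aware score: penalize FID when generated
    samples sit suspiciously close to training samples (tau is assumed)."""
    d = memorization_distance(gen_feats, train_feats)
    penalty = 1.0 / (d + eps) if d < tau else 1.0
    return fid * penalty
```

Under this scheme, a submission whose outputs lie unusually close to training samples (distance below tau) has its FID multiplied by roughly 1/d, so directly copying the training set no longer wins the benchmark.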
Pages: 2534-2542 (9 pages)