Diverse Weight Averaging for Out-of-Distribution Generalization

Cited by: 0
Authors
Rame, Alexandre [1]
Kirchmeyer, Matthieu [1, 2]
Rahier, Thibaud [2]
Rakotomamonjy, Alain [2, 4]
Gallinari, Patrick [1, 2]
Cord, Matthieu [1, 3]
Affiliations
[1] Sorbonne Univ, CNRS, ISIR, F-75005 Paris, France
[2] Criteo AI Lab, Paris, France
[3] Valeo Ai, Paris, France
[4] Univ Rouen, LITIS, Rouen, France
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA) strategies were shown to perform best on the competitive DomainBed benchmark; they directly average the weights of multiple networks despite their nonlinearities. In this paper, we propose Diverse Weight Averaging (DiWA), a new WA strategy whose main motivation is to increase the functional diversity across averaged models. To this end, DiWA averages weights obtained from several independent training runs: indeed, models obtained from different runs are more diverse than those collected along a single run thanks to differences in hyperparameters and training procedures. We motivate the need for diversity by a new bias-variance-covariance-locality decomposition of the expected error, exploiting similarities between WA and standard functional ensembling. Moreover, this decomposition highlights that WA succeeds when the variance term dominates, which we show occurs when the marginal distribution changes at test time. Experimentally, DiWA consistently improves the state of the art on DomainBed without inference overhead.
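The core operation described in the abstract, averaging the weights of models from several independent runs, can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `average_weights` and `predict`, the dictionary layout of the weights, and the three toy "runs" are all assumptions made for illustration. The toy model is linear, a case in which averaging weights and averaging predictions coincide exactly.

```python
from typing import Dict, List

# Hypothetical weight container: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(runs: List[Weights]) -> Weights:
    """Uniformly average parameters, name by name, across runs."""
    if not runs:
        raise ValueError("need at least one set of weights")
    averaged: Weights = {}
    for name in runs[0]:
        # Group the i-th value of this parameter from every run together.
        columns = zip(*(run[name] for run in runs))
        averaged[name] = [sum(col) / len(runs) for col in columns]
    return averaged


def predict(weights: Weights, x: List[float]) -> float:
    """Toy linear model (an assumption for this sketch): w . x + b."""
    return sum(w * xi for w, xi in zip(weights["w"], x)) + weights["b"][0]


# Three made-up sets of weights standing in for independent training runs.
runs = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 0.0], "b": [1.0]},
    {"w": [2.0, 1.0], "b": [2.0]},
]

wa = average_weights(runs)
x = [1.0, 1.0]

# One forward pass through the averaged model ...
wa_prediction = predict(wa, x)
# ... versus averaging the predictions of all three models (ensembling).
ensemble_prediction = sum(predict(r, x) for r in runs) / len(runs)
```

In this linear toy case the two predictions agree, while the averaged model needs a single forward pass, which matches the abstract's point about having no inference overhead. For nonlinear networks the equality generally holds only approximately, so the sketch should not be read as evidence for the method's behaviour on real models.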
Pages: 16
Related papers
50 in total
  • [21] Graph Out-of-Distribution Generalization With Controllable Data Augmentation
    Lu, Bin
    Zhao, Ze
    Gan, Xiaoying
    Liang, Shiyu
    Fu, Luoyi
    Wang, Xinbing
    Zhou, Chenghu
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (11) : 6317 - 6329
  • [22] Probing out-of-distribution generalization in machine learning for materials
    Li, Kangming
    Rubungo, Andre Niyongabo
    Lei, Xiangyun
    Persaud, Daniel
    Choudhary, Kamal
    Decost, Brian
    Dieng, Adji Bousso
    Hattrick-Simpers, Jason
    COMMUNICATIONS MATERIALS, 2025, 6 (01)
  • [23] Tackling Domain Generalization for Out-of-Distribution Endoscopic Imaging
    Ali Teevno, Mansoor
    Ochoa-Ruiz, Gilberto
    Ali, Sharib
    MACHINE LEARNING IN MEDICAL IMAGING, PT II, MLMI 2024, 2025, 15242 : 43 - 52
  • [24] RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction
    Yu, Yemin
    Yuan, Luotian
    Wei, Ying
    Gao, Hanyu
    Wu, Fei
    Wang, Zhihua
    Ye, Xinhai
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 1, 2024 : 374 - 382
  • [25] Deep Relevant Feature Focusing for Out-of-Distribution Generalization
    Wang, Fawu
    Zhang, Kang
    Liu, Zhengyu
    Yuan, Xia
    Zhao, Chunxia
    PATTERN RECOGNITION AND COMPUTER VISION, PT I, PRCV 2022, 2022, 13534 : 245 - 253
  • [26] Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?
    Zhang, Dinghuai
    Ahuja, Kartik
    Xu, Yilun
    Wang, Yisen
    Courville, Aaron
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [27] Understanding and Improving Feature Learning for Out-of-Distribution Generalization
    Chen, Yongqiang
    Huang, Wei
    Zhou, Kaiwen
    Bian, Yatao
    Han, Bo
    Cheng, James
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [28] Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
    June, Yoon Gyo
    Park, Jaewoo
    Dong, Xingbo
    Park, Hojin
    Teoh, Andrew Beng Jin
    Camps, Octavia
    COMPUTER VISION - ECCV 2024, PT LXXV, 2025, 15133 : 396 - 413
  • [29] Learning Invariant Graph Representations for Out-of-Distribution Generalization
    Li, Haoyang
    Zhang, Ziwei
    Wang, Xin
    Zhu, Wenwu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [30] Fishr: Invariant Gradient Variances for Out-of-Distribution Generalization
    Rame, Alexandre
    Dancette, Corentin
    Cord, Matthieu
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022