Diverse Weight Averaging for Out-of-Distribution Generalization

Cited: 0
Authors
Rame, Alexandre [1]
Kirchmeyer, Matthieu [1,2]
Rahier, Thibaud [2]
Rakotomamonjy, Alain [2,4]
Gallinari, Patrick [1,2]
Cord, Matthieu [1,3]
Affiliations
[1] Sorbonne Univ, CNRS, ISIR, F-75005 Paris, France
[2] Criteo AI Lab, Paris, France
[3] Valeo Ai, Paris, France
[4] Univ Rouen, LITIS, Rouen, France
Keywords
DOI
None
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA) strategies were shown to perform best on the competitive DomainBed benchmark; they directly average the weights of multiple networks despite their nonlinearities. In this paper, we propose Diverse Weight Averaging (DiWA), a new WA strategy whose main motivation is to increase the functional diversity across averaged models. To this end, DiWA averages weights obtained from several independent training runs: indeed, models obtained from different runs are more diverse than those collected along a single run thanks to differences in hyperparameters and training procedures. We motivate the need for diversity by a new bias-variance-covariance-locality decomposition of the expected error, exploiting similarities between WA and standard functional ensembling. Moreover, this decomposition highlights that WA succeeds when the variance term dominates, which we show occurs when the marginal distribution changes at test time. Experimentally, DiWA consistently improves the state of the art on DomainBed without inference overhead.
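The abstract describes weight averaging across independent training runs: several networks sharing one architecture are trained separately, and their parameters are averaged coordinate-wise into a single model, so inference costs no more than a single network. A minimal sketch of that uniform averaging step, using plain Python dictionaries in place of real framework parameter containers (the function name and toy values are illustrative, not from the paper's code):

```python
def average_weights(state_dicts):
    """Uniformly average parameter dictionaries from several
    independently trained models that share one architecture."""
    n = len(state_dicts)
    keys = state_dicts[0].keys()
    # Coordinate-wise mean over the n runs for every parameter.
    return {k: sum(sd[k] for sd in state_dicts) / n for k in keys}

# Toy example: three "runs", each contributing scalar parameters.
runs = [
    {"w": 1.0, "b": 0.0},
    {"w": 3.0, "b": 0.3},
    {"w": 2.0, "b": 0.6},
]
avg = average_weights(runs)  # avg["w"] == 2.0
```

In a deep-learning framework the same idea would average tensors rather than scalars; the diversity that DiWA targets comes from varying hyperparameters and training procedure across the runs being averaged, not from the averaging step itself.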
Pages: 16