Diverse Weight Averaging for Out-of-Distribution Generalization

Authors
Rame, Alexandre [1]
Kirchmeyer, Matthieu [1, 2]
Rahier, Thibaud [2]
Rakotomamonjy, Alain [2, 4]
Gallinari, Patrick [1, 2]
Cord, Matthieu [1, 3]
Affiliations
[1] Sorbonne Univ, CNRS, ISIR, F-75005 Paris, France
[2] Criteo AI Lab, Paris, France
[3] Valeo Ai, Paris, France
[4] Univ Rouen, LITIS, Rouen, France
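Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract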
Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA) strategies were shown to perform best on the competitive DomainBed benchmark; they directly average the weights of multiple networks despite their nonlinearities. In this paper, we propose Diverse Weight Averaging (DiWA), a new WA strategy whose main motivation is to increase the functional diversity across averaged models. To this end, DiWA averages weights obtained from several independent training runs: indeed, models obtained from different runs are more diverse than those collected along a single run thanks to differences in hyperparameters and training procedures. We motivate the need for diversity by a new bias-variance-covariance-locality decomposition of the expected error, exploiting similarities between WA and standard functional ensembling. Moreover, this decomposition highlights that WA succeeds when the variance term dominates, which we show occurs when the marginal distribution changes at test time. Experimentally, DiWA consistently improves the state of the art on DomainBed without inference overhead.
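The core operation the abstract describes, averaging the weights of models obtained from several independent training runs, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `average_weights`, the toy parameter names, the representation of weights as plain lists of floats, and the choice of uniform averaging (the abstract does not spell out the averaging rule). It is not the authors' implementation.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def average_weights(runs: Sequence[Weights]) -> Weights:
    """Uniformly average parameters, name by name, across training runs."""
    if not runs:
        raise ValueError("need at least one set of weights")
    reference = runs[0]
    for weights in runs[1:]:
        # All runs must describe the same architecture.
        if set(weights) != set(reference):
            raise ValueError("all runs must share the same parameter names")
        for name in reference:
            if len(weights[name]) != len(reference[name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")
    averaged: Weights = {}
    for name in reference:
        # zip(*...) groups the i-th value of this parameter across all runs.
        columns = zip(*(weights[name] for weights in runs))
        averaged[name] = [sum(column) / len(runs) for column in columns]
    return averaged


# Toy stand-ins for weights obtained from three independent runs.
run_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
run_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
run_c = {"layer.weight": [5.0, 6.0], "layer.bias": [2.0]}

diwa_weights = average_weights([run_a, run_b, run_c])
print(diwa_weights)
```

Because the result is a single set of weights, prediction uses one model, which is consistent with the abstract's claim of no inference overhead; a functional ensemble would instead run every member at test time.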
Pages: 16