Diverse Weight Averaging for Out-of-Distribution Generalization

Authors
Rame, Alexandre [1]
Kirchmeyer, Matthieu [1,2]
Rahier, Thibaud [2]
Rakotomamonjy, Alain [2,4]
Gallinari, Patrick [1,2]
Cord, Matthieu [1,3]
Affiliations
[1] Sorbonne Univ, CNRS, ISIR, F-75005 Paris, France
[2] Criteo AI Lab, Paris, France
[3] Valeo Ai, Paris, France
[4] Univ Rouen, LITIS, Rouen, France
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA) strategies were shown to perform best on the competitive DomainBed benchmark; they directly average the weights of multiple networks despite their nonlinearities. In this paper, we propose Diverse Weight Averaging (DiWA), a new WA strategy whose main motivation is to increase the functional diversity across averaged models. To this end, DiWA averages weights obtained from several independent training runs: indeed, models obtained from different runs are more diverse than those collected along a single run thanks to differences in hyperparameters and training procedures. We motivate the need for diversity by a new bias-variance-covariance-locality decomposition of the expected error, exploiting similarities between WA and standard functional ensembling. Moreover, this decomposition highlights that WA succeeds when the variance term dominates, which we show occurs when the marginal distribution changes at test time. Experimentally, DiWA consistently improves the state of the art on DomainBed without inference overhead.
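The core operation described in the abstract, directly averaging the weights of several independently trained networks, can be illustrated with a minimal sketch. This is not the authors' implementation: the `StateDict` format (parameter name mapped to a flat list of floats), the function name `average_weights`, and the three toy "runs" are all hypothetical, and the sketch assumes every run uses the same architecture with identical parameter names and shapes.

```python
from typing import Dict, List, Sequence

# Hypothetical flat parameter format: name -> list of floats.
StateDict = Dict[str, List[float]]


def average_weights(state_dicts: Sequence[StateDict]) -> StateDict:
    """Uniformly average parameters, name by name, across models."""
    if not state_dicts:
        raise ValueError("need at least one model")
    names = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != names:
            raise ValueError("all models must share the same parameter names")
    averaged: StateDict = {}
    for name in names:
        # Group the i-th entry of this parameter across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        averaged[name] = [sum(col) / len(state_dicts) for col in columns]
    return averaged


# Three toy "runs" (made-up values) standing in for independent training runs.
runs = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [2.0, 4.0], "b": [3.0]},
    {"w": [3.0, 6.0], "b": [6.0]},
]
diwa = average_weights(runs)
```

Because the result is a single set of weights, inference costs the same as for one network, which is consistent with the "without inference overhead" claim in the abstract; a functional ensemble would instead run every member at test time.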
Pages: 16