Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent

Citations: 0
Authors:
Li, Zhiyuan [1 ]
Wang, Tianhao [2 ]
Lee, Jason D. [1 ]
Arora, Sanjeev [1 ]
Affiliations:
[1] Princeton Univ, Princeton, NJ 08540 USA
[2] Yale Univ, New Haven, CT 06511 USA
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Subject Classification: 081104; 0812; 0835; 1405
Abstract
As part of the effort to understand implicit bias of gradient descent in overparametrized models, several results have shown how the training trajectory on the overparametrized model can be understood as mirror descent on a different objective. The main result here is a characterization of this phenomenon under a notion termed commuting parametrization, which encompasses all the previous results in this setting. It is shown that gradient flow with any commuting parametrization is equivalent to continuous mirror descent with a related Legendre function. Conversely, continuous mirror descent with any Legendre function can be viewed as gradient flow with a related commuting parametrization. The latter result relies upon Nash's embedding theorem.
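A minimal numerical sketch of the equivalence described in the abstract (not code from the paper): for the classic commuting parametrization w = u², gradient flow on u corresponds to mirror flow on w with Legendre function φ(w) = (w log w − w)/4, i.e. mirror map ∇φ(w) = (log w)/4. The specific loss, step size, and initialization below are illustrative choices; with a small step size, Euler discretizations of both flows trace approximately the same w-trajectory.

```python
import math

eta, steps, target = 1e-4, 50_000, 4.0

# Gradient descent on the reparametrized objective L(u) = 0.5 * (u**2 - target)**2
u = 1.0
for _ in range(steps):
    u -= eta * 2.0 * u * (u * u - target)  # dL/du = 2u * (u^2 - target)

# Mirror descent on L(w) = 0.5 * (w - target)**2 with mirror map theta = log(w)/4
theta = math.log(1.0) / 4.0                # same starting point: w = 1
for _ in range(steps):
    w = math.exp(4.0 * theta)              # map dual variable back to w
    theta -= eta * (w - target)            # dL/dw = w - target, step in dual space
w = math.exp(4.0 * theta)

print(u * u, w)  # the two trajectories end near the same point
```

The agreement follows because both discretizations approximate the same ODE, dw/dt = −4w(w − target); the paper's result generalizes this correspondence to arbitrary commuting parametrizations.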
Pages: 15