Gradient Estimation with Stochastic Softmax Tricks

Cited: 0
Authors
Paulus, Max B. [1]
Choi, Dami [2]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Univ Toronto, Toronto, ON, Canada
Keywords
(none listed)
DOI
(none available)
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The Gumbel-Max trick is the basis of many relaxed gradient estimators. These estimators are easy to implement and have low variance, but scaling them to large combinatorial distributions remains an open challenge. Working within the perturbation model framework, we introduce stochastic softmax tricks, which generalize the Gumbel-Softmax trick to combinatorial spaces. Our framework is a unified perspective on existing relaxed estimators for perturbation models, and it contains many novel relaxations. We design structured relaxations for subset selection, spanning trees, arborescences, and others. When compared to less structured baselines, we find that stochastic softmax tricks can be used to train latent variable models that perform better and discover more latent structure.
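For context, the categorical Gumbel-Softmax relaxation that the paper generalizes can be sketched as follows. This is a minimal illustrative implementation, not code from the paper: the function name and NumPy usage are assumptions, and the paper's contribution (structured relaxations over combinatorial spaces such as spanning trees) goes well beyond this categorical case.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a relaxed one-hot sample via the Gumbel-Softmax trick.

    Each logit is perturbed with i.i.d. Gumbel(0, 1) noise, then a
    temperature-controlled softmax is applied. As tau -> 0 the sample
    approaches the hard argmax draw of the Gumbel-Max trick; for tau > 0
    the sample lies in the simplex interior, so gradients flow through it.
    """
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise via inverse transform sampling: -log(-log(U))
    gumbels = -np.log(-np.log(rng.uniform(size=len(logits))))
    y = (logits + gumbels) / tau
    y = y - y.max()  # subtract max for numerical stability
    expy = np.exp(y)
    return expy / expy.sum()

# One relaxed sample from a 3-way categorical distribution
sample = gumbel_softmax(np.array([1.0, 2.0, 0.5]), tau=0.5)
```

Lowering `tau` pushes samples toward one-hot vertices (closer to the discrete distribution, but with higher-variance gradients); raising it smooths the samples and the gradient signal.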
Pages: 14