Gradient Estimation with Stochastic Softmax Tricks

Cited by: 0
Authors
Paulus, Max B. [1 ]
Choi, Dami [2 ]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Univ Toronto, Toronto, ON, Canada
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Gumbel-Max trick is the basis of many relaxed gradient estimators. These estimators are easy to implement and low variance, but the goal of scaling them comprehensively to large combinatorial distributions is still outstanding. Working within the perturbation model framework, we introduce stochastic softmax tricks, which generalize the Gumbel-Softmax trick to combinatorial spaces. Our framework is a unified perspective on existing relaxed estimators for perturbation models, and it contains many novel relaxations. We design structured relaxations for subset selection, spanning trees, arborescences, and others. When compared to less structured baselines, we find that stochastic softmax tricks can be used to train latent variable models that perform better and discover more latent structure.
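For context, the Gumbel-Softmax trick that the paper generalizes can be sketched in a few lines. The snippet below is only a minimal illustration of that building block, not the paper's implementation; the function names, the temperature value, and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def sample_gumbel(shape, rng, eps=1e-20):
    """Draw standard Gumbel noise via -log(-log(U))."""
    u = rng.uniform(low=eps, high=1.0, size=shape)
    return -np.log(-np.log(u))

def gumbel_max_sample(logits, rng):
    """Gumbel-Max trick: an exact categorical sample, but the argmax is not differentiable."""
    return np.argmax(logits + sample_gumbel(logits.shape, rng))

def gumbel_softmax_sample(logits, rng, temperature=0.5):
    """Gumbel-Softmax trick: relax the argmax to a temperature-controlled softmax,
    giving a point on the simplex that is differentiable w.r.t. the logits."""
    g = logits + sample_gumbel(logits.shape, rng)
    z = (g - g.max()) / temperature          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.2, 0.7]))   # unnormalized log-probabilities
print(gumbel_max_sample(logits, rng))        # hard sample: an index
print(gumbel_softmax_sample(logits, rng))    # soft, differentiable sample on the simplex
```

As the temperature approaches zero, the soft sample concentrates on the Gumbel-Max argmax; the stochastic softmax tricks described in the abstract replace this categorical softmax with structured relaxations over combinatorial objects such as subsets or spanning trees.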
Pages: 14
Related Papers
50 records in total
  • [1] Stochastic gradient descent tricks
    Bottou, Léon
    LECTURE NOTES IN COMPUTER SCIENCE, 2012, 7700 : 421 - 436
  • [2] Softmax-kernel reproduced gradient descent for stochastic optimization on data
    Lin, Yifu
    Li, Wenling
    Liu, Yang
    Song, Jia
    SIGNAL PROCESSING, 2025, 231
  • [3] On Biased Stochastic Gradient Estimation
    Driggs, Derek
    Liang, Jingwei
    Schönlieb, Carola-Bibiane
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [4] Fast and Accurate Stochastic Gradient Estimation
    Chen, Beidi
    Xu, Yingchen
    Shrivastava, Anshumali
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [5] POLICY EVALUATION WITH STOCHASTIC GRADIENT ESTIMATION TECHNIQUES
    Zhou, Yi
    Fu, Michael C.
    Ryzhov, Ilya O.
    2022 WINTER SIMULATION CONFERENCE (WSC), 2022, : 3039 - 3050
  • [6] Gradient Estimation Using Stochastic Computation Graphs
    Schulman, John
    Heess, Nicolas
    Weber, Theophane
    Abbeel, Pieter
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [7] LIKELIHOOD RATIO GRADIENT ESTIMATION FOR STOCHASTIC RECURSIONS
    GLYNN, PW
    LECUYER, P
    ADVANCES IN APPLIED PROBABILITY, 1995, 27 (04) : 1019 - 1053
  • [8] ERROR OF ESTIMATION OF DIRECTION IN STOCHASTIC GRADIENT PROBLEMS
    BRAITSEV, NA
    GOLOVIN, VN
    KIRILLIN, VV
    INDUSTRIAL LABORATORY, 1978, 44 (10): : 1410 - 1411
  • [9] Gradient-Enhanced Softmax for Face Recognition
    Sun, Linjun
    Li, Weijun
    Ning, Xin
    Zhang, Liping
    Dong, Xiaoli
    He, Wei
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (05) : 1185 - 1189