Discrete Auto-regressive Variational Attention Models for Text Modeling

Cited by: 0
Authors
Fang, Xianghong [1 ]
Bai, Haoli [1 ]
Li, Jian [1 ]
Xu, Zenglin [2 ]
Lyu, Michael [1 ]
King, Irwin [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Engn, Shenzhen, Peoples R China
Keywords
Text Modeling; Information Underrepresentation; Posterior Collapse;
DOI
10.1109/IJCNN52387.2021.9534375
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Variational autoencoders (VAEs) have been widely applied to text modeling. In practice, however, they are troubled by two challenges: information underrepresentation and posterior collapse. The former arises because only the last hidden state of the LSTM encoder is transformed into the latent space, which is generally insufficient to summarize the data. The latter is a long-standing problem in the training of VAEs, where the optimization is trapped in a disastrous local optimum. In this paper, we propose the Discrete Auto-regressive Variational Attention Model (DAVAM) to address these challenges. Specifically, we introduce an auto-regressive variational attention approach to enrich the latent space by effectively capturing the semantic dependencies in the input. We further design a discrete latent space for the variational attention and mathematically show that our model is free from posterior collapse. Extensive experiments on language modeling tasks demonstrate the superiority of DAVAM over several VAE counterparts. Code will be released.
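The abstract does not give implementation details, but the two ingredients it names (attending over all encoder states rather than only the last one, and a discrete latent space) are commonly realized with a Gumbel-softmax relaxation over attention scores. The following is an illustrative sketch under that assumption, not the authors' exact formulation; all names and shapes here are hypothetical.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a relaxed one-hot sample from a categorical distribution.

    A standard trick for training discrete latent variables; used here
    only to illustrate discrete attention over encoder timesteps.
    """
    rng = rng or np.random.default_rng(0)
    # Gumbel(0, 1) noise makes the argmax a categorical sample;
    # dividing by the temperature tau relaxes it to a soft one-hot.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    y = np.exp(y - y.max())          # numerically stable softmax
    return y / y.sum()

# Toy setup: attend over all 5 encoder hidden states (dim 8) instead of
# summarizing the sequence with only the final state.
hidden = np.random.default_rng(1).normal(size=(5, 8))
logits = hidden @ np.ones(8)         # hypothetical attention scores
weights = gumbel_softmax(logits, tau=0.5)
context = weights @ hidden           # attended summary of the sequence
```

The `weights` vector is a (relaxed) discrete choice over timesteps, so the context vector can draw on any part of the input rather than the last hidden state alone.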
Pages: 8
Related papers
50 records total
  • [21] Cutoff for a class of auto-regressive models with vanishing additive noise
    Gerencser, Balazs
    Ottolini, Andrea
    SCANDINAVIAN JOURNAL OF STATISTICS, 2025, 52 (01) : 314 - 331
  • [22] GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models
    You, Jiaxuan
    Ying, Rex
    Ren, Xiang
    Hamilton, William L.
    Leskovec, Jure
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [23] Heterogeneous Auto-Regressive Modeling based Realised Volatility Forecasting
    Avinash, G.
    Ramasubramanian, V.
    Gopalakrishnan, Badri Narayanan
    STATISTICS AND APPLICATIONS, 2023, 21 (02): : 121 - 140
  • [24] Dissecting Recall of Factual Associations in Auto-Regressive Language Models
    Geva, Mor
    Bastings, Jasmijn
    Filippova, Katja
    Globerson, Amir
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 12216 - 12235
  • [25] Auto-regressive extractive summarization with replacement
    Zhu, Tianyu
    Hua, Wen
    Qu, Jianfeng
    Hosseini, Saeid
    Zhou, Xiaofang
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2023, 26 (04): : 2003 - 2026
  • [26] Locally adaptive spatial smoothing using conditional auto-regressive models
    Lee, Duncan
    Mitchell, Richard
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2013, 62 (04) : 593 - 608
  • [27] Estimation of the order of an auto-regressive model
    Rao, NS
    Moharir, PS
    SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 1995, 20 : 749 - 758
  • [28] Non-Parametric Sparse Additive Auto-Regressive Network Models
    Zhou, Hao Henry
    Raskutti, Garvesh
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (03) : 1473 - 1492
  • [29] A linear and nonlinear auto-regressive model and its application in modeling and forecasting
    Ma, Jiaxin
    Xu, Feiyun
    Huang, Ren
    Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2013, 43 (03): : 509 - 514
  • [30] Full-brain auto-regressive modeling (FARM) using fMRI
    Garg, Rahul
    Cecchi, Guillermo A.
    Rao, A. Ravishankar
    NEUROIMAGE, 2011, 58 (02) : 416 - 441