Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning

Cited by: 0
Authors:
Li, Ziming [1]
Kiseleva, Julia [1,2]
de Rijke, Maarten [1]
Affiliations:
[1] University of Amsterdam, Amsterdam, Netherlands
[2] Microsoft Research AI, Redmond, WA, USA
Keywords: (none listed)
DOI: not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsensical replies. To alleviate the first problem, we extend a recently proposed adversarial dialogue generation method into an adversarial imitation learning solution. Then, within the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that provides a more accurate and precise reward signal for generator training. We evaluate the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model generates higher-quality responses and achieves higher overall performance than the state-of-the-art.
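
To make the abstract's notion of a discriminator-derived "reward signal" concrete, the following is a minimal Python sketch, not the authors' code, of the generic adversarial inverse reinforcement learning (AIRL; Fu et al., 2018) reward parameterization that the abstract's framework alludes to: the discriminator takes the form D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), so the generator reward log D - log(1 - D) simplifies to f(s, a) - log pi(a|s). The paper's own reward model for dialogue is a new contribution and may differ; all function names and numbers below are illustrative assumptions.

    import math

    # Hedged sketch of an AIRL-style reward, NOT the implementation from this paper.
    # Under the AIRL discriminator D(s, a) = exp(f) / (exp(f) + pi(a|s)),
    # the generator reward log D - log(1 - D) reduces to f(s, a) - log pi(a|s).

    def airl_reward(f_value: float, log_pi: float) -> float:
        """Reward for one generated action: the learned reward f minus the
        generator's own log-probability (an entropy-like correction)."""
        return f_value - log_pi

    def discriminator_prob(f_value: float, log_pi: float) -> float:
        """D(s, a) under the AIRL parameterization, shown for reference."""
        return math.exp(f_value) / (math.exp(f_value) + math.exp(log_pi))

    if __name__ == "__main__":
        # Toy values: the reward function scores a response step at f = 1.2,
        # and the generator assigned that step log-probability -0.7.
        f_val, log_pi = 1.2, -0.7
        r = airl_reward(f_val, log_pi)
        d = discriminator_prob(f_val, log_pi)
        # Sanity check: log D - log(1 - D) equals f - log pi.
        assert abs((math.log(d) - math.log(1.0 - d)) - r) < 1e-9
        print(f"reward = {r:.3f}, D = {d:.3f}")

One design point worth noting: because the reward is a learned function f rather than the raw discriminator probability, it stays informative even when the discriminator saturates, which is one standard remedy for the sparse, unstable signals the abstract describes.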
Pages: 6722-6729 (8 pages)
Related papers (showing 10 of 50):
  • [1] Lee, Keuntaek; Vlahov, Bogdan; Gibson, Jason; Rehg, James M.; Theodorou, Evangelos A. Approximate Inverse Reinforcement Learning from Vision-based Imitation Learning. 2021 IEEE International Conference on Robotics and Automation (ICRA 2021), 2021: 10793-10799.
  • [2] Piot, Bilal; Geist, Matthieu; Pietquin, Olivier. Bridging the Gap Between Imitation Learning and Inverse Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(8): 1814-1826.
  • [3] Zhang, K.; Yu, Y. Methodologies for Imitation Learning via Inverse Reinforcement Learning: A Review. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56(2): 254-261.
  • [4] Du, Wanyu; Ji, Yangfeng. An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation. 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), 2019: 6012-6018.
  • [5] Hsueh, Cheng-Hsun; Ma, Wei-Yun. Semantic Guidance of Dialogue Generation with Reinforcement Learning. SIGDIAL 2020: 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2020: 1-9.
  • [6] Han, Dong-Sig; Kim, Hyunseo; Lee, Hyundo; Ryu, Je-Hwan; Zhang, Byoung-Tak. Robust Imitation via Mirror Descent Inverse Reinforcement Learning. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.
  • [7] Tekden, Ahmet E.; Ugur, Emre; Nagai, Yukie; Oztop, Erhan. Modeling the Development of Infant Imitation using Inverse Reinforcement Learning. 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), 2018: 155-160.
  • [8] Kober, J.; Peters, J. Imitation and Reinforcement Learning. IEEE Robotics and Automation Magazine, 2010, 17(2): 55-62.
  • [9] Chandramohan, Senthilkumar; Geist, Matthieu; Lefevre, Fabrice; Pietquin, Olivier. User Simulation in Dialogue Systems using Inverse Reinforcement Learning. 12th Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), 2011: 1032+.
  • [10] Kanwal, Sameera; Farooq, Muhammad Shoaib. Reinforcement Learning for Dialogue Generation: A Systematic Literature Review. 4th International Conference on Innovative Computing (IC)2, 2021: 519-528.