Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning

Cited: 0
Authors:
Li, Ziming [1]
Kiseleva, Julia [1,2]
de Rijke, Maarten [1]
Affiliations:
[1] Univ Amsterdam, Amsterdam, Netherlands
[2] Microsoft Res AI, Redmond, WA, USA
Keywords: (none listed)
DOI: (not available)
CLC classification: TP18 [Artificial intelligence theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsensical replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that provides a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate higher-quality responses and achieve better overall performance than the state-of-the-art.
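The record does not reproduce the paper's reward formulation. As a minimal illustrative sketch of the general idea only (the standard AIRL discriminator parameterization, not necessarily this paper's exact model; the function names and scalar interface below are assumptions), the contrast between a GAN-style imitation reward and an AIRL-style reward can be written as:

```python
import math

def gail_reward(d_prob: float) -> float:
    """GAIL-style reward log D(s, a) from the discriminator's
    probability output: it becomes sparse and unstable when the
    discriminator saturates near 0 or 1."""
    return math.log(max(d_prob, 1e-8))

def airl_reward(f_value: float, log_pi: float) -> float:
    """AIRL-style reward. The discriminator is parameterized as
    D = exp(f) / (exp(f) + pi(a|s)), so algebraically
    log D - log(1 - D) = f - log pi(a|s),
    i.e. the learned function f is recovered as a reward estimate
    that is decoupled from the current policy's probabilities."""
    d = math.exp(f_value) / (math.exp(f_value) + math.exp(log_pi))
    return math.log(d) - math.log(1.0 - d)
```

The identity in `airl_reward` is the key point: subtracting the policy's log-probability inside the discriminator yields a denser, policy-independent reward signal, which is the kind of signal the abstract argues a plain discriminator fails to provide.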
Pages: 6722-6729
Page count: 8
Related papers (50 total)
  • [21] Inverse Reinforcement Learning for Trajectory Imitation Using Static Output Feedback Control. Xue, Wenqian; Lian, Bosen; Fan, Jialu; Chai, Tianyou; Lewis, Frank L. IEEE Transactions on Cybernetics, 2024, 54(3): 1695-1707.
  • [22] Optimizing Crop Management with Reinforcement Learning and Imitation Learning. Tao, Ran; Zhao, Pan; Wu, Jing; Martin, Nicolas; Harrison, Matthew T.; Ferreira, Carla; Kalantari, Zahra; Hovakimyan, Naira. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023: 6228-6236.
  • [23] Learning to Drive Using Sparse Imitation Reinforcement Learning. Han, Yuci; Yilmaz, Alper. 2022 26th International Conference on Pattern Recognition (ICPR), 2022: 3736-3742.
  • [24] Learning Fairness from Demonstrations via Inverse Reinforcement Learning. Blandin, Jack; Kash, Ian. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024), 2024: 51-61.
  • [25] Learning from Demonstration for Shaping through Inverse Reinforcement Learning. Suay, Halit Bener; Brys, Tim; Taylor, Matthew E.; Chernova, Sonia. AAMAS'16: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016: 429-437.
  • [26] Implicit imitation in multiagent reinforcement learning. Price, B.; Boutilier, C. Machine Learning, Proceedings, 1999: 325-334.
  • [27] Robotic Manipulation with Reinforcement Learning, State Representation Learning, and Imitation Learning. Chen, Hanxiao. Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence, 2021, 35: 15769-15770.
  • [28] Inverse reinforcement learning from summary data. Kangasrääsiö, Antti; Kaski, Samuel. Machine Learning, 2018, 107(8-10): 1517-1535.
  • [30] Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid. Veith, Eric Msp; Logemann, Torben; Berezin, Aleksandr; Wellssow, Arlena; Balduin, Stephan. 2024 12th Workshop on Modeling and Simulation of Cyber-Physical Energy Systems (MSCPES), 2024.