WeaSuLπ: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue

Cited by: 0
Authors
Khandelwal, Anant [1]
Affiliations
[1] Amazon, India Machine Learning, Bangalore, Karnataka, India
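Keywords
MODEL
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract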
An intelligent dialogue system in a multi-turn setting should not only generate good-quality responses but also generate responses that lead to the long-term success of the dialogue. Although current approaches have improved response quality, they overlook the training signals present in the dialogue data. We can leverage these signals to generate weakly supervised training data for learning a dialogue policy and a reward estimator, so that the policy takes actions (generates responses) that anticipate the future direction of a successful (rewarding) conversation. We simulate a dialogue in which an agent and a user (modelled like the agent, with a supervised learning objective) interact with each other. The agent uses dynamic blocking to generate ranked, diverse responses and exploration-exploitation to select among the Top-K responses. Each simulated state-action pair is evaluated (serving as a weak annotation) by three quality modules: Semantic Relevant, Semantic Coherence and Consistent Flow. Empirical studies on two benchmarks indicate that our model significantly improves response quality and leads to successful conversations under both automatic evaluation and human judgment.
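The selection and weak-annotation steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which exploration-exploitation rule is used or how the three module scores are combined, so epsilon-greedy selection and a plain average are assumptions made here, and the names `weak_reward` and `select_response` are hypothetical.

```python
import random
from typing import Sequence


def weak_reward(module_scores: Sequence[float]) -> float:
    """Combine per-module quality scores into one weak label.

    Assumption: a plain average. The abstract names three modules
    (Semantic Relevant, Semantic Coherence, Consistent Flow) but does
    not state how their scores are combined.
    """
    return sum(module_scores) / len(module_scores)


def select_response(
    ranked: Sequence[str], k: int, epsilon: float, rng: random.Random
) -> str:
    """Pick one response from the Top-K of a best-first ranked list.

    Assumption: epsilon-greedy. With probability epsilon, explore by
    sampling uniformly from the Top-K; otherwise exploit by taking
    the top-ranked response.
    """
    top_k = ranked[:k]
    if rng.random() < epsilon:
        return rng.choice(top_k)
    return top_k[0]


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = ["response A", "response B", "response C", "response D"]
    chosen = select_response(candidates, k=3, epsilon=0.2, rng=rng)
    # Hypothetical module scores for the chosen state-action pair.
    print(chosen, weak_reward([0.5, 1.0, 0.0]))
```

Setting `epsilon` to 0 always returns the top-ranked response, while larger values trade ranking confidence for more varied simulated dialogues.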
Pages: 69-80
Page count: 12