WeaSuLπ: Weakly Supervised Dialogue Policy Learning: Reward Estimation for Multi-turn Dialogue

Cited by: 0
Authors
Khandelwal, Anant [1]
Affiliations
[1] Amazon, India Machine Learning, Bangalore, Karnataka, India
Keywords
MODEL;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An intelligent dialogue system in a multi-turn setting should not only generate responses of good quality, but also generate responses that can lead to the long-term success of the dialogue. Although current approaches have improved response quality, they overlook the training signals present in the dialogue data. We can leverage these signals to generate weakly supervised training data for learning a dialogue policy and a reward estimator, and make the policy take actions (generate responses) that can foresee the future direction of a successful (rewarding) conversation. We simulate the dialogue between an agent and a user (modelled similarly to the agent, with a supervised learning objective) interacting with each other. The agent uses dynamic blocking to generate ranked diverse responses and exploration-exploitation to select among the Top-K responses. Each simulated state-action pair is evaluated (which works as a weak annotation) with three quality modules: Semantic Relevant, Semantic Coherence and Consistent Flow. Empirical studies with two benchmarks indicate that our model can significantly improve response quality and lead to successful conversations under both automatic evaluation and human judgment.
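The selection step described in the abstract (rank candidate responses, then choose among the Top-K with exploration-exploitation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the epsilon-greedy rule, the unweighted average used to combine module scores, and the names `weak_reward` and `select_response` are all hypothetical, and the scoring functions stand in for the three quality modules.

```python
import random
from typing import Callable, List, Optional, Sequence


def weak_reward(scores: Sequence[float]) -> float:
    """Combine per-module quality scores into one weak reward.

    Assumption: an unweighted mean. The abstract does not say how the
    three module scores are combined.
    """
    return sum(scores) / len(scores)


def select_response(
    candidates: List[str],
    score_fns: Sequence[Callable[[str], float]],
    k: int = 3,
    epsilon: float = 0.1,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a response from the Top-K candidates (hypothetical helper).

    Assumption: epsilon-greedy is used as one simple form of
    exploration-exploitation. With probability `epsilon` a random
    Top-K candidate is returned (exploration); otherwise the
    highest-scoring candidate is returned (exploitation).
    """
    rng = rng or random.Random()
    # Rank candidates by their combined weak reward, best first.
    ranked = sorted(
        candidates,
        key=lambda c: weak_reward([fn(c) for fn in score_fns]),
        reverse=True,
    )
    top_k = ranked[:k]
    if rng.random() < epsilon:
        return rng.choice(top_k)
    return top_k[0]
```

In this sketch each entry of `score_fns` would be one quality module returning a score for a candidate response; in practice those modules would also need the dialogue context, which is omitted here for brevity.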
Pages: 69-80
Page count: 12
Related papers
50 in total
  • [21] Semantic Role Labeling Guided Multi-turn Dialogue ReWriter
    Xu, Kun
    Tan, Haochen
    Song, Linfeng
    Wu, Han
    Zhang, Haisong
    Song, Linqi
    Yu, Dong
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6632 - 6639
  • [22] Envisioning Future from the Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation
    Lv, Ang
    Li, Jinpeng
    Xie, Shufang
    Yan, Rui
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7382 - 7394
  • [23] Human-Machine Multi-Turn Language Dialogue Interaction Based on Deep Learning
    Ke, Xianxin
    Hu, Ping
    Yang, Chenghao
    Zhang, Renbao
    MICROMACHINES, 2022, 13 (03)
  • [24] Multi-turn dialogue-oriented pretrained question generation model
    Yanmeng Wang
    Wenge Rong
    Jianfei Zhang
    Shijie Zhou
    Zhang Xiong
    Complex & Intelligent Systems, 2020, 6 : 493 - 505
  • [25] Multi-turn dialogue comprehension from a topic-aware perspective
    Ma, Xinbei
    Xu, Yi
    Zhao, Hai
    Zhang, Zhuosheng
    NEUROCOMPUTING, 2024, 578
  • [26] Debiasing Counterfactual Context With Causal Inference for Multi-Turn Dialogue Reasoning
    Wang, Xu
    Zhang, Hainan
    Zhao, Shuai
    Chen, Hongshen
    Ding, Zhuoye
    Wan, Zhiguo
    Cheng, Bo
    Lan, Yanyan
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 1125 - 1132
  • [27] Improve the Response Diversity of Multi-turn Dialogue System by Combining Knowledge
    Li, Zhengpeng
    Wu, Jiansheng
    Miao, Jiawei
    Yu, Xinmiao
    IAENG International Journal of Computer Science, 2022, 49 (03)
  • [28] Multi-turn Intent Determination for Goal-oriented Dialogue systems
    Abro, Waheed Ahmed
    Qi, Guilin
    Gao, Huan
    Khan, Muhammad Asif
    Ali, Zafar
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [29] Research on Models for Multi-turn Task-oriented Dialogue Systems
    Qiu, Jie
    Wang, Peng
    Gou, Jianguo
    Qiu, Junying
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 5439 - 5444
  • [30] Deep context modeling for multi-turn response selection in dialogue systems
    Li, Lu
    Li, Chenliang
    Ji, Donghong
    INFORMATION PROCESSING & MANAGEMENT, 2021, 58 (01)