A multi-agent curiosity reward model for task-oriented dialogue systems

Cited by: 1
Authors
Sun, Jingtao [1 ,2 ]
Kou, Jiayin [1 ,2 ]
Hou, Wenyan [1 ,2 ]
Bai, Yujei [1 ,2 ]
Affiliations
[1] Xian Univ Posts & Telecommun, Sch Comp Sci & Technol, Xian 710121, Shaanxi, Peoples R China
[2] Xian Univ Posts & Telecommun, Shaanxi Key Lab Network Data Anal & Intelligent Pr, Xian, Peoples R China
Keywords
Task-oriented dialogue systems; Reinforcement learning; Curiosity rewards; Exploration and exploitation
DOI
10.1016/j.patcog.2024.110884
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In practical decision-making dialogues, reinforcement learning methods are hampered by delayed and sparse reward feedback and, in some cases, by the absence of rewards altogether. These issues impede efficient learning of dialogue strategies and degrade model performance. To address this challenge, this paper introduces the Multi-Agent Curiosity Reward Model (MACRM) for task-oriented dialogue systems. First, for the dialogue reward mechanism, a forward dynamics model generates curiosity rewards, which are combined with the extrinsic rewards returned by the dialogue environment to mitigate the reward sparsity caused by inadequate agent exploration. Second, for the dialogue strategy training mechanism, an exploration-exploitation approach inspired by how organisms explore is adopted: the agents explore the dialogue environment thoroughly in the early stages of training and exploit the learned knowledge in later stages, which balances exploration and exploitation and improves the efficiency of dialogue strategy learning. To assess the model's effectiveness, experiments are conducted on the MultiWOZ corpus under three reward environments: (1) extrinsic rewards only, (2) curiosity rewards only, and (3) a combination of both. The results show that agents using MACRM learn dialogue strategies faster than those relying on a single exploratory reward method, effectively addressing reward sparsity and delay in practical decision-making scenarios.
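The reward mechanism summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-error form of the curiosity reward, the linear annealing schedule, and every name and default value below (`curiosity_reward`, `curiosity_weight`, `combined_reward`, `scale`) are assumptions chosen for illustration. The abstract states only that a forward dynamics model produces curiosity rewards, that these are combined with extrinsic rewards, and that exploration dominates early while exploitation dominates later.

```python
from typing import Sequence

Vector = Sequence[float]


def curiosity_reward(predicted_next: Vector, actual_next: Vector,
                     scale: float = 0.5) -> float:
    """Intrinsic reward as the scaled squared prediction error of a
    forward dynamics model (assumed form, not taken from the paper)."""
    if len(predicted_next) != len(actual_next):
        raise ValueError("state vectors must have the same length")
    return scale * sum((p - a) ** 2
                       for p, a in zip(predicted_next, actual_next))


def curiosity_weight(step: int, total_steps: int,
                     start: float = 1.0, end: float = 0.0) -> float:
    """Linearly anneal the curiosity weight from `start` to `end`, so that
    early training favors exploration and late training favors exploitation
    (hypothetical schedule)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def combined_reward(extrinsic: float, predicted_next: Vector,
                    actual_next: Vector, step: int,
                    total_steps: int) -> float:
    """Extrinsic reward plus the annealed curiosity reward."""
    weight = curiosity_weight(step, total_steps)
    return extrinsic + weight * curiosity_reward(predicted_next, actual_next)


if __name__ == "__main__":
    # The forward model predicted [0.0, 1.0]; the observed next state was [1.0, 1.0].
    print(combined_reward(0.0, [0.0, 1.0], [1.0, 1.0], step=0, total_steps=100))
    print(combined_reward(0.0, [0.0, 1.0], [1.0, 1.0], step=100, total_steps=100))
```

Under these assumptions, the three reward environments in the abstract correspond to using the extrinsic term alone, the curiosity term alone, or their sum as computed by `combined_reward`.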
Pages: 11
Related papers
50 in total
  • [1] A multi-agent collaborative algorithm for task-oriented dialogue systems
    Sun, Jingtao
    Kou, Jiayin
    Shi, Weipeng
    Hou, Wenyan
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2025, 16 (03) : 2009 - 2022
  • [2] Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition
    Takanobu, Ryuichi
    Liang, Runze
    Huang, Minlie
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020 : 625 - 638
  • [3] Deep Reinforcement Learning Based Task-Oriented Communication in Multi-Agent Systems
    He, Guojun
    Feng, Mingjie
    Zhang, Yu
    Liu, Guanghua
    Dai, Yueyue
    Jiang, Tao
    IEEE WIRELESS COMMUNICATIONS, 2023, 30 (03) : 112 - 119
  • [4] Multi-goal multi-agent learning for task-oriented dialogue with bidirectional teacher-student learning
    He, Wanwei
    Sun, Yang
    Yang, Min
    Ji, Feng
    Li, Chengming
    Xu, Ruifeng
    KNOWLEDGE-BASED SYSTEMS, 2021, 213
  • [5] A Survey on Task-Oriented Dialogue Systems
    Zhao Y.-Y.
    Wang Z.-Y.
    Wang P.
    Yang T.
    Zhang R.
    Yin K.
    Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (10) : 1862 - 1896
  • [6] Learning Task-Oriented Channel Allocation for Multi-Agent Communication
    He, Guojun
    Cui, Shibo
    Dai, Yueyue
    Jiang, Tao
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (11) : 12016 - 12029
  • [7] Abstract Architecture for Task-oriented Multi-agent Problem Solving
    Vokrinek, Jiri
    Komenda, Antonin
    Pechoucek, Michal
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2011, 41 (01) : 31 - 40
  • [8] A Multi-Agent Approach to Modeling Task-Oriented Dialog Policy Learning
    Liang, Songfeng
    Xu, Kai
    Dong, Zhurong
    IEEE ACCESS, 2025, 13 : 11754 - 11764
  • [10] Research on Models for Multi-turn Task-oriented Dialogue Systems
    Qiu, Jie
    Wang, Peng
    Gou, Jianguo
    Qiu, Junying
    2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019 : 5439 - 5444