A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition

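Authors
Wang, Huimin [1]
Wong, Kam-Fai [1]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract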
Most reinforcement learning methods for dialog policy learning train a centralized agent that selects a predefined joint action formed by concatenating a domain name, an intent type, and a slot name. Because the resulting action space is large, the centralized dialog agent needs a large number of user-agent interactions to learn. In addition, designing the concatenated actions is laborious for engineers and may fail on edge cases. To solve these problems, we model dialog policy learning with a novel multi-agent framework in which each part of the action is handled by a different agent. The framework reduces the labor cost of writing action templates and decreases the size of the action space for each agent. Furthermore, we mitigate the non-stationarity problem, which arises because the dynamics of the environment change as the agents' policies evolve, by introducing a joint optimization process that lets the agents exchange their policy information. In addition, an independent experience replay buffer mechanism is integrated to reduce the dependence between the gradients of samples and thereby improve training efficiency. The effectiveness of the proposed framework is demonstrated in a multi-domain environment with both user simulator evaluation and human evaluation.
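The action-space reduction claimed in the abstract can be illustrated with a small counting sketch. The domain, intent, and slot lists below are hypothetical placeholders, not the ontology used in the paper, and the function names are made up for illustration. The sketch only compares the number of outputs a single centralized agent would need against the number each decomposed agent would need; it does not implement the paper's training procedure.

```python
from itertools import product

# Hypothetical ontology for illustration only; not taken from the paper.
DOMAINS = ["hotel", "restaurant", "train"]
INTENTS = ["inform", "request", "recommend", "book"]
SLOTS = ["name", "area", "price", "time", "day"]


def joint_action_space(domains, intents, slots):
    """Enumerate every concatenated (domain, intent, slot) action
    that a single centralized agent would have to choose among."""
    return list(product(domains, intents, slots))


def decomposed_action_spaces(domains, intents, slots):
    """Give each agent only its own part of the action,
    one agent per action component."""
    return {
        "domain_agent": list(domains),
        "intent_agent": list(intents),
        "slot_agent": list(slots),
    }


joint = joint_action_space(DOMAINS, INTENTS, SLOTS)
per_agent = decomposed_action_spaces(DOMAINS, INTENTS, SLOTS)

joint_size = len(joint)                                       # product of the three sizes
largest_agent_space = max(len(a) for a in per_agent.values())  # largest single agent
total_agent_outputs = sum(len(a) for a in per_agent.values())  # sum of the three sizes

print(f"centralized agent: {joint_size} actions")
print(f"largest decomposed agent: {largest_agent_space} actions")
print(f"all decomposed agents combined: {total_agent_outputs} outputs")
```

Under these assumed sizes the joint space grows multiplicatively with the number of domains, intents, and slots, while each decomposed agent's space grows only with its own component. The agents still have to coordinate so that their separate choices form a valid combined action, which is the role the abstract assigns to the joint optimization process.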
Pages: 7882-7889
Page count: 8