Building Adaptive Dialogue Systems Via Bayes-Adaptive POMDPs

Cited by: 7
Authors
Png, Shaowei [1]
Pineau, Joelle [1]
Chaib-draa, Brahim [2]
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada
[2] Univ Laval, Dept Comp Sci, Quebec City, PQ G1V 0A6, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Dialogue management; reinforcement learning; Markov decision process (MDP); partially observable Markov decision process (POMDP); Bayesian inference; MARKOV-PROCESSES; ALGORITHMS; MODEL
DOI
10.1109/JSTSP.2012.2229962
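Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809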
Abstract
Recent research has shown that effective dialogue management can be achieved through the Partially Observable Markov Decision Process (POMDP) framework. However, past research on POMDP-based dialogue systems usually assumed that the parameters of the decision process were known a priori. The main contribution of this paper is a Bayesian reinforcement learning framework for learning the POMDP parameters online from data, in a decision-theoretic manner. We discuss various approximations and assumptions that can be leveraged to ensure computational tractability, and apply these techniques to learning observation models for several simulated spoken dialogue domains.
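As a rough illustration of what learning an observation model online can look like, the sketch below keeps Dirichlet counts over P(o | s, a), uses their mean for a standard POMDP belief update, and then credits each observation to the hidden states in proportion to the updated belief. This is a simplified approximation written for this record, not the algorithm from the paper: the function names, the array layout `counts[a, s, o]`, and the soft count update are all assumptions, and a full Bayes-adaptive treatment would track a joint belief over states and counts.

```python
import numpy as np

def expected_obs_model(counts):
    """Mean of the Dirichlet posterior: counts[a, s, o] -> P(o | s, a)."""
    return counts / counts.sum(axis=2, keepdims=True)

def belief_update(belief, action, obs, trans, counts):
    """Standard POMDP belief update using the expected observation model."""
    obs_model = expected_obs_model(counts)
    predicted = belief @ trans[action]                 # P(s' | b, a)
    unnormalized = predicted * obs_model[action, :, obs]
    return unnormalized / unnormalized.sum()

def update_counts(counts, belief_next, action, obs):
    """Soft update (an assumption): add the posterior weight of each state to its count."""
    new_counts = counts.copy()
    new_counts[action, :, obs] += belief_next
    return new_counts

# Hypothetical toy domain: 2 user goals (states), 1 action, 2 observations.
trans = np.array([[[1.0, 0.0], [0.0, 1.0]]])           # user goal does not change
counts = np.array([[[2.0, 1.0], [1.0, 2.0]]])          # weakly informative prior
belief = np.array([0.5, 0.5])

belief_next = belief_update(belief, action=0, obs=0, trans=trans, counts=counts)
counts_next = update_counts(counts, belief_next, action=0, obs=0)
```

Each observation adds a total of one pseudo-count, so the estimate of the observation model sharpens as more dialogue turns are seen, while the prior counts control how quickly the data override the initial guess.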
Pages: 917-927
Number of pages: 11