Building Adaptive Dialogue Systems Via Bayes-Adaptive POMDPs

Cited by: 7

Authors:

Png, Shaowei [1]

Pineau, Joelle [1]

Chaib-draa, Brahim [2]

Affiliations:

[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada

[2] Univ Laval, Dept Comp Sci, Quebec City, PQ G1V 0A6, Canada

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2012 / Vol. 6 / Issue 08

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Dialogue management; reinforcement learning; Markov decision process (MDP); partially observable Markov decision process (POMDP); Bayesian inference; MARKOV-PROCESSES; ALGORITHMS; MODEL;

DOI:

10.1109/JSTSP.2012.2229962

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Recent research has shown that effective dialogue management can be achieved through the Partially Observable Markov Decision Process (POMDP) framework. However, past research on POMDP-based dialogue systems usually assumed that the parameters of the decision process were known a priori. The main contribution of this paper is to present a Bayesian reinforcement learning framework for learning the POMDP parameters online from data, in a decision-theoretic manner. We discuss various approximations and assumptions that can be leveraged to ensure computational tractability, and apply these techniques to learning observation models for several simulated spoken dialogue domains.

Citation

Pages: 917-927

Number of pages: 11
