Optimizing Quantiles in Preference-Based Markov Decision Processes

Cited by: 0

Authors:

Gilbert, Hugo [1]

Weng, Paul [2,3,4]

Xu, Yan [2,3,4]

Affiliations:

[1] UPMC Univ Paris 06, Sorbonne Univ, CNRS, LIP6, UMR 7606, Paris, France

[2] SYSU CMU Joint Inst Engn, Guangzhou, Guangdong, Peoples R China

[3] Sch Elect & Informat Technol, Guangzhou, Guangdong, Peoples R China

[4] SYSU CMU Shunde Int Joint Res Inst, Shunde, Peoples R China

Source:

THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2017

Keywords:

MINIMIZING RISK MODELS; VARIANCE; UTILITY; POLICY;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the Markov decision process model, policies are usually evaluated by expected cumulative rewards. As this decision criterion is not always suitable, we propose in this paper an algorithm for computing a policy optimal for the quantile criterion. Both finite and infinite horizons are considered. Finally, we experimentally evaluate our approach on random MDPs and on a data center control problem.

Citation

Pages: 3569-3575

Number of pages: 7

