Optimizing Quantiles in Preference-Based Markov Decision Processes

Cited by: 0
Authors
Gilbert, Hugo [1]
Weng, Paul [2, 3, 4]
Xu, Yan [2, 3, 4]
Affiliations
[1] UPMC Univ Paris 06, Sorbonne Univ, CNRS, LIP6,UMR 7606, Paris, France
[2] SYSU CMU Joint Inst Engn, Guangzhou, Guangdong, Peoples R China
[3] Sch Elect & Informat Technol, Guangzhou, Guangdong, Peoples R China
[4] SYSU CMU Shunde Int Joint Res Inst, Shunde, Peoples R China
Keywords
MINIMIZING RISK MODELS; VARIANCE; UTILITY; POLICY;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the Markov decision process (MDP) model, policies are usually evaluated by their expected cumulative rewards. As this decision criterion is not always suitable, we propose in this paper an algorithm for computing a policy that is optimal for the quantile criterion. Both finite and infinite horizons are considered. Finally, we experimentally evaluate our approach on random MDPs and on a data center control problem.
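To illustrate how a quantile criterion can rank policies differently from the expected value, here is a minimal Python sketch. It is not the algorithm proposed in the paper: the two return distributions, the function names, and the use of the standard lower-quantile definition (the smallest outcome whose cumulative probability reaches tau) are all assumptions made for illustration.

```python
def expected_value(dist):
    """Expected value of a discrete distribution given as {outcome: probability}."""
    return sum(x * p for x, p in dist.items())


def lower_quantile(dist, tau):
    """Smallest outcome x such that P(X <= x) >= tau (assumed definition)."""
    cumulative = 0.0
    for x in sorted(dist):
        cumulative += dist[x]
        # Small tolerance guards against floating-point rounding in the sum.
        if cumulative >= tau - 1e-12:
            return x
    return max(dist)


# Hypothetical cumulative-reward distributions induced by two policies.
risky = {0: 0.6, 20: 0.4}   # high payoff, but usually nothing
safe = {5: 1.0}             # always a modest payoff

for name, dist in (("risky", risky), ("safe", safe)):
    print(name, expected_value(dist), lower_quantile(dist, 0.5))
```

Under these made-up numbers, the expectation favors the risky policy while the median (tau = 0.5) favors the safe one, which is the kind of disagreement that motivates using a criterion other than expected cumulative reward.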
Pages: 3569-3575
Number of pages: 7