Distributional Pareto-Optimal Multi-Objective Reinforcement Learning

Authors
Cai, Xin-Qiang [1,2]
Zhang, Pushi [2 ]
Zhao, Li [2 ]
Bian, Jiang [2 ]
Sugiyama, Masashi [1,3]
Llorens, Ashley J. [2 ]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] RIKEN AIP, Tokyo, Japan
Keywords
STOCHASTIC-DOMINANCE;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-objective reinforcement learning (MORL) has been proposed to learn control policies over multiple competing objectives under each possible preference over returns. However, current MORL algorithms fail to account for distributional preferences over the multivariate returns, which are particularly important in real-world scenarios such as autonomous driving. To address this issue, we extend the concept of Pareto-optimality in MORL to distributional Pareto-optimality, which captures the optimality of return distributions rather than of their expectations. Our proposed method, called Distributional Pareto-Optimal Multi-Objective Reinforcement Learning (DPMORL), is capable of learning distributional Pareto-optimal policies that balance multiple objectives while accounting for return uncertainty. We evaluated our method on several benchmark problems and demonstrated its effectiveness, compared to existing MORL methods, in discovering distributional Pareto-optimal policies and satisfying diverse distributional preferences.
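
The gap between expectation-based and distribution-based comparison described in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the DPMORL algorithm: the two policies, their return samples, and the concave `risk_averse_utility` function are all hypothetical and chosen only for illustration. Both policies have the same expected return vector, so neither Pareto-dominates the other in expectation, yet a risk-averse utility over the full return distribution separates them.

```python
from math import sqrt

def mean_vector(samples):
    """Component-wise mean of a list of equally likely return vectors."""
    n = len(samples)
    dims = len(samples[0])
    return tuple(sum(s[i] for s in samples) / n for i in range(dims))

def pareto_dominates(u, v):
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def expected_utility(samples, utility):
    """Average utility over equally likely return vectors."""
    return sum(utility(s) for s in samples) / len(samples)

def risk_averse_utility(ret):
    """Hypothetical concave utility: penalizes spread in each objective."""
    return sqrt(ret[0]) + sqrt(ret[1])

# Hypothetical two-objective return samples (each outcome equally likely).
policy_a = [(5.0, 5.0), (5.0, 5.0)]    # deterministic returns
policy_b = [(0.0, 0.0), (10.0, 10.0)]  # same mean, high variance

mean_a = mean_vector(policy_a)
mean_b = mean_vector(policy_b)

# Expectations coincide, so expectation-based Pareto dominance cannot separate them.
tie_in_expectation = (
    not pareto_dominates(mean_a, mean_b) and not pareto_dominates(mean_b, mean_a)
)

# A preference defined over the return distribution does separate them.
utility_a = expected_utility(policy_a, risk_averse_utility)
utility_b = expected_utility(policy_b, risk_averse_utility)

print("means:", mean_a, mean_b)
print("tie in expectation:", tie_in_expectation)
print("risk-averse preference for A:", utility_a > utility_b)
```

Swapping in a convex (risk-seeking) utility would reverse the ranking, which is why a method that keeps only expected returns cannot serve both kinds of preference.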
Pages: 21