Distributional Pareto-Optimal Multi-Objective Reinforcement Learning

Authors
Cai, Xin-Qiang [1 ,2 ]
Zhang, Pushi [2 ]
Zhao, Li [2 ]
Bian, Jiang [2 ]
Sugiyama, Masashi [1 ,3 ]
Llorens, Ashley J. [2 ]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] RIKEN AIP, Tokyo, Japan
Keywords
STOCHASTIC-DOMINANCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-objective reinforcement learning (MORL) has been proposed to learn control policies over multiple competing objectives for each possible preference over returns. However, current MORL algorithms fail to account for distributional preferences over the multivariate returns, which are particularly important in real-world scenarios such as autonomous driving. To address this issue, we extend the concept of Pareto-optimality in MORL into distributional Pareto-optimality, which captures the optimality of return distributions rather than only their expectations. Our proposed method, called Distributional Pareto-Optimal Multi-Objective Reinforcement Learning (DPMORL), is capable of learning distributional Pareto-optimal policies that balance multiple objectives while considering the return uncertainty. We evaluated our method on several benchmark problems and demonstrated its effectiveness in discovering distributional Pareto-optimal policies and satisfying diverse distributional preferences compared to existing MORL methods.
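The gap the abstract describes, between optimality of expected returns and optimality of return distributions, can be illustrated with a small sketch. This is not the DPMORL algorithm and does not reproduce the paper's definition of distributional Pareto-optimality. It only shows, on made-up data, that two policies can be indistinguishable by expected return yet ranked differently by a risk-sensitive utility. The sample returns, the policy names `steady` and `risky`, and the `worst_case_utility` function are all hypothetical.

```python
from statistics import mean

def dominates(a, b):
    """True if return vector a Pareto-dominates b (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Indices of points that no other point dominates."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]

def expected_return(samples):
    """Component-wise mean of sampled multi-objective returns."""
    return tuple(mean(component) for component in zip(*samples))

def worst_case_utility(ret):
    """Hypothetical risk-sensitive utility: the value of the worse objective."""
    return min(ret)

# Hypothetical sampled returns (objective 1, objective 2) for two policies.
steady = [(5.0, 5.0), (5.0, 5.0)]
risky = [(10.0, 0.0), (0.0, 10.0)]

# Both policies have the same expected return vector, so neither
# dominates the other in the expectation-based sense.
exp_steady = expected_return(steady)
exp_risky = expected_return(risky)
same_in_expectation = exp_steady == exp_risky

# A utility applied to each sampled return, then averaged, tells them apart.
util_steady = mean(worst_case_utility(r) for r in steady)
util_risky = mean(worst_case_utility(r) for r in risky)
```

Here `same_in_expectation` is true while `util_steady` exceeds `util_risky`, which is the kind of distributional preference an expectation-only Pareto front cannot express.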
Pages: 21