Multi-objective Reinforcement Learning with Path Integral Policy Improvement

Citations: 0
Authors
Ariizumi, Ryo [1]
Sago, Hayato [2]
Asai, Toru [2]
Azuma, Shun-ichi
Affiliations
[1] Tokyo Univ Agr & Technol, Dept Mech Syst Engn, Tokyo, Japan
[2] Nagoya Univ, Grad Sch Engn, Nagoya, Japan
Keywords
Multi-objective reinforcement learning; policy improvement;
DOI
10.23919/SICE59929.2023.10354223
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multi-objective reinforcement learning (MORL) for robot motion learning is challenging not only because data are scarce but also because the state and action spaces are high-dimensional and continuous. Most existing MORL algorithms are inadequate in this regard. In single-objective reinforcement learning, however, policy-based algorithms have solved the problem of high-dimensional, continuous state and action spaces. One such algorithm is policy improvement with path integrals (PI2), which has been successful in robot motion learning. PI2 is similar to evolution strategies (ES), and multi-objective optimization is an active research topic for ES algorithms. This paper proposes a MORL algorithm based on PI2 and multi-objective ES that can handle the problems arising in robot motion learning. Its effectiveness is shown via numerical simulations.
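The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the two ingredients it names, not the authors' method. It assumes a commonly used PI2-style update (exponentiated, min-max normalized rollout costs used as weights for averaging exploration noise) and a plain Pareto non-dominance filter of the kind used in multi-objective ES. The function names and the temperature-like parameter `h` are hypothetical choices made for this example.

```python
import math


def pi2_weights(costs, h=10.0):
    """Softmax-style weights over rollouts: lower cost gives larger weight.

    Assumption: costs are min-max normalized before exponentiation.
    """
    lo, hi = min(costs), max(costs)
    span = hi - lo
    if span == 0.0:
        # All rollouts are equally good: fall back to uniform weights.
        return [1.0 / len(costs)] * len(costs)
    expo = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(expo)
    return [e / total for e in expo]


def pi2_update(theta, perturbations, costs, h=10.0):
    """Shift theta by the weight-averaged exploration noise."""
    w = pi2_weights(costs, h)
    return [
        t + sum(wk * eps[i] for wk, eps in zip(w, perturbations))
        for i, t in enumerate(theta)
    ]


def dominates(a, b):
    """True if cost vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(cost_vectors):
    """Indices of cost vectors that no other vector dominates."""
    return [
        i
        for i, a in enumerate(cost_vectors)
        if not any(dominates(b, a) for j, b in enumerate(cost_vectors) if j != i)
    ]
```

In a single-objective setting, `pi2_update` alone moves the policy parameters toward low-cost perturbations. With several cost functions, one plausible use of `non_dominated` is to select which rollouts or candidate policies to keep, though how the paper combines the two steps is not stated in the abstract.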
Pages: 1418-1423
Number of pages: 6