Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions

Cited by: 0
Authors
Chen, Xiong-Hui [1]
Luo, Fan-Ming [1]
Yu, Yang [1]
Li, Qingyang [2]
Qin, Zhiwei [2]
Shang, Wenjie [3]
Ye, Jieping [3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
[2] DiDi Labs, Mountain View, CA 94043 USA
[3] DiDi Chuxing, Beijing 300450, Peoples R China
Funding
U.S. National Science Foundation
Keywords
Adaptation models; Uncertainty; Predictive models; Behavioral sciences; Extrapolation; Trajectory; Reinforcement learning; Adaptable policy learning; meta learning; model-based reinforcement learning; offline reinforcement learning;
DOI
10.1109/TPAMI.2023.3317131
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In reinforcement learning, a promising direction to avoid online trial-and-error costs is learning from an offline dataset. Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies. Such constraints, however, also limit the potential of the outcome policies. In this paper, to release the potential of offline policy learning, we investigate the decision-making problems in out-of-support regions directly and propose offline Model-based Adaptable Policy LEarning (MAPLE). By this approach, instead of learning in in-support regions, we learn an adaptable policy that can adapt its behavior in out-of-support regions when deployed. We give a practical implementation of MAPLE via meta-learning techniques and ensemble model learning techniques. We conduct experiments on MuJoCo locomotion tasks with offline datasets. The results show that the proposed method can make robust decisions in out-of-support regions and achieve better performance than SOTA algorithms.
Pages: 15260-15274
Page count: 15
Related papers
50 in total
  • [21] MOPO: Model-based Offline Policy Optimization
    Yu, Tianhe
    Thomas, Garrett
    Yu, Lantao
    Ermon, Stefano
    Zou, James
    Levine, Sergey
    Finn, Chelsea
    Ma, Tengyu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [22] Adaptable distance-based decision-making support in dynamic cross-grid environment
    Gossa, Julien
    Pierson, Jean-Marc
    Brunie, Lionel
    EURO-PAR 2007 PARALLEL PROCESSING, PROCEEDINGS, 2007, 4641 : 437 - +
  • [23] Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
    Yang, Shentao
    Feng, Yihao
    Zhang, Shujian
    Zhou, Mingyuan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [24] ADAPTIVE MODEL TO SUPPORT DECISION-MAKING IN LANGUAGE E-LEARNING
    Bradac, Vladimir
    EDULEARN13: 5TH INTERNATIONAL CONFERENCE ON EDUCATION AND NEW LEARNING TECHNOLOGIES, 2013, : 4036 - 4045
  • [25] A Guide to an Iterative Approach to Model-Based Decision Making in Health and Medicine: An Iterative Decision-Making Framework
    Kunst, Natalia
    Burger, Emily A.
    Coupe, Veerle M. H.
    Kuntz, Karen M.
    Aas, Eline
    PHARMACOECONOMICS, 2024, 42 (04) : 363 - 371
  • [26] A Guide to an Iterative Approach to Model-Based Decision Making in Health and Medicine: An Iterative Decision-Making Framework
    Natalia Kunst
    Emily A. Burger
    Veerle M. H. Coupé
    Karen M. Kuntz
    Eline Aas
    PharmacoEconomics, 2024, 42 : 363 - 371
  • [27] Optimization of Data Collection Strategies for Model-Based Evaluation and Decision-Making
    Cain, Robert
    van Moorsel, Aad
    2012 42ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN), 2012
  • [28] From model-based perceptual decision-making to spatial interference control
    van Maanen, Leendert
    Turner, Brandon
    Forstmann, Birte U.
    CURRENT OPINION IN BEHAVIORAL SCIENCES, 2015, 1 : 72 - 77
  • [29] MODEL-BASED METHOD FOR COMPUTER-AIDED MEDICAL DECISION-MAKING
    WEISS, SM
    KULIKOWSKI, CA
    AMAREL, S
    SAFIR, A
    ARTIFICIAL INTELLIGENCE, 1978, 11 (1-2) : 145 - 172
  • [30] Symbolic Reasoning for Early Decision-Making in Model-Based Systems Engineering
    Cederbladh, Johan
    Cleophas, Loek
    Kamburjan, Eduard
    Lima, Lucas
    Vangheluwe, Hans
    2023 ACM/IEEE INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS COMPANION, MODELS-C, 2023, : 721 - 725