Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

Cited by: 0
Authors
Jin, Wanxin [1]
Wang, Zhaoran [2]
Yang, Zhuoran [3]
Mou, Shaoshuai [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Northwestern Univ, Evanston, IL 60208 USA
[3] Princeton Univ, Princeton, NJ 08544 USA
Keywords
MODEL;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework for solving a broad class of learning and control tasks. PDP is distinguished from existing methods by two novel techniques. First, we differentiate through Pontryagin's Maximum Principle, which yields the analytical derivative of a trajectory with respect to tunable parameters within an optimal control system and enables end-to-end learning of dynamics, policies, and/or control objective functions. Second, we propose an auxiliary control system in the backward pass of the PDP framework; the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, which can be iteratively solved using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and a 6-DoF rocket powered landing.
Pages: 14
Related papers
50 in total
  • [31] End-to-End Deep Learning Proactive Content Caching Framework
    Bakr, Eslam Mohamed
    Ben-Ammar, Hamza
    Eraqi, Hesham M.
    Aly, Sherif G.
    Elbatt, Tamer
    Ghamri-Doudane, Yacine
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 1043 - 1048
  • [32] End-to-end offline reinforcement learning for glycemia control
    Beolet, Tristan
    Adenis, Alice
    Huneker, Erik
    Louis, Maxime
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2024, 154
  • [33] End-to-End Deep Reinforcement Learning for Exoskeleton Control
    Rose, Lowell
    Bazzocchi, Michael C. F.
    Nejat, Goldie
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 4294 - 4301
  • [34] Affordance Learning for End-to-End Visuomotor Robot Control
    Hamalainen, Aleksi
    Arndt, Karol
    Ghadirzadeh, Ali
    Kyrki, Ville
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 1781 - 1788
  • [35] A Fuzzy Admission Control Strategy for End-to-End QoS Framework
    Yu, Gang
    2009 WRI WORLD CONGRESS ON SOFTWARE ENGINEERING, VOL 1, PROCEEDINGS, 2009, : 273 - 275
  • [36] DRUM: End-To-End Differentiable Rule Mining On Knowledge Graphs
    Sadeghian, Ali
    Armandpour, Mohammadreza
    Ding, Patrick
    Wang, Daisy Zhe
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [37] Safe Pontryagin Differentiable Programming
    Jin, Wanxin
    Mou, Shaoshuai
    Pappas, George J.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [38] End-to-end differentiable construction of molecular mechanics force fields
    Wang, Yuanqing
    Fass, Josh
    Kaminow, Benjamin
    Herr, John E.
    Rufa, Dominic
    Zhang, Ivy
    Pulido, Ivan
    Henry, Mike
    Macdonald, Hannah E. Bruce
    Takaba, Kenichiro
    Chodera, John D.
    CHEMICAL SCIENCE, 2022, 13 (41) : 12016 - 12033
  • [39] End-to-end automatic lens design with a differentiable diffraction model
    Zhang, Wenguan
    Ren, Zheng
    Hou, Jingwen
    Hen, Shiqi
    Feng, Huajun
    Li, Q., I
    Xu, Zhihai
    Chen, Yueting
    OPTICS EXPRESS, 2024, 32 (25) : 44328 - 44345
  • [40] End-to-End Complex Lens Design with Differentiable Ray Tracing
    Sun, Qilin
    Wang, Congli
    Fu, Qiang
    Dun, Xiong
    Heidrich, Wolfgang
    ACM TRANSACTIONS ON GRAPHICS, 2021, 40 (04):