Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework

Cited by: 0

Authors:

Jin, Wanxin [1]

Wang, Zhaoran [2]

Yang, Zhuoran [3]

Mou, Shaoshuai [1]

Affiliations:

[1] Purdue Univ, W Lafayette, IN 47907 USA

[2] Northwestern Univ, Evanston, IL 60208 USA

[3] Princeton Univ, Princeton, NJ 08544 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020, Vol. 33

Keywords:

MODEL;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework for solving a broad class of learning and control tasks. PDP is distinguished from existing methods by two novel techniques. First, we differentiate through Pontryagin's Maximum Principle, which yields the analytical derivative of a trajectory with respect to the tunable parameters of an optimal control system and enables end-to-end learning of dynamics, policies, and/or control objective functions. Second, we propose an auxiliary control system in the backward pass of the PDP framework; the output of this auxiliary control system is the analytical derivative of the original system's trajectory with respect to the parameters, and it can be solved iteratively using standard control tools. We investigate three learning modes of PDP: inverse reinforcement learning, system identification, and control/planning. We demonstrate the capability of PDP in each learning mode on different high-dimensional systems, including a multi-link robot arm, a 6-DoF maneuvering quadrotor, and 6-DoF rocket powered landing.
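The two ideas in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes a scalar, discrete-time linear-quadratic problem with dynamics x[t+1] = a*x[t] + b*u[t], cost 0.5*sum(q*x[t]^2 + r*u[t]^2) + 0.5*q*x[T]^2, and a single tunable parameter a. Differentiating the Pontryagin conditions of this problem with respect to a gives a second, auxiliary linear-quadratic system whose state X[t] and input U[t] are the derivatives dx[t]/da and du[t]/da. All function names (`solve_lq`, `trajectory_gradient`, `finite_difference`) are hypothetical and chosen only for this illustration.

```python
def solve_lq(a, b, q, r, x0, T):
    """Forward pass: optimal trajectory of the scalar LQ problem (Riccati recursion)."""
    P = [0.0] * (T + 1)  # value-function curvature, costate is lam[t] = P[t] * x[t]
    K = [0.0] * T        # feedback gains, u[t] = -K[t] * x[t]
    P[T] = q
    for t in range(T - 1, -1, -1):
        S = r + b * b * P[t + 1]
        K[t] = b * P[t + 1] * a / S
        P[t] = q + a * P[t + 1] * (a - b * K[t])
    x, u = [x0], []
    for t in range(T):
        u.append(-K[t] * x[t])
        x.append(a * x[t] + b * u[t])
    lam = [P[t] * x[t] for t in range(T + 1)]
    return x, u, lam, P


def trajectory_gradient(a, b, q, r, x0, T):
    """Backward pass: solve the auxiliary system obtained by differentiating
    the Pontryagin conditions with respect to the parameter a.

    Differentiated conditions (X = dx/da, U = du/da, L = dlam/da):
        X[t+1] = a*X[t] + b*U[t] + x[t],   X[0] = 0
        L[t]   = q*X[t] + a*L[t+1] + lam[t+1],   L[T] = q*X[T]
        0      = r*U[t] + b*L[t+1]
    They are solved with the ansatz L[t] = P[t]*X[t] + W[t].
    """
    x, u, lam, P = solve_lq(a, b, q, r, x0, T)
    # Backward sweep for the affine costate term W (W[T] = 0).
    W = [0.0] * (T + 1)
    for t in range(T - 1, -1, -1):
        S = r + b * b * P[t + 1]
        W[t] = a * (r / S) * (P[t + 1] * x[t] + W[t + 1]) + lam[t + 1]
    # Forward sweep for the trajectory derivatives.
    X, U = [0.0], []
    for t in range(T):
        S = r + b * b * P[t + 1]
        U.append(-(b / S) * (P[t + 1] * (a * X[t] + x[t]) + W[t + 1]))
        X.append(a * X[t] + b * U[t] + x[t])
    return X, U


def finite_difference(a, b, q, r, x0, T, eps=1e-6):
    """Central-difference estimate of dx/da and du/da, used only as a check."""
    xp, up, _, _ = solve_lq(a + eps, b, q, r, x0, T)
    xm, um, _, _ = solve_lq(a - eps, b, q, r, x0, T)
    dX = [(p - m) / (2 * eps) for p, m in zip(xp, xm)]
    dU = [(p - m) / (2 * eps) for p, m in zip(up, um)]
    return dX, dU


if __name__ == "__main__":
    X, U = trajectory_gradient(0.9, 0.5, 1.0, 0.1, 1.0, 10)
    dX, dU = finite_difference(0.9, 0.5, 1.0, 0.1, 1.0, 10)
    print("max state-derivative gap:", max(abs(p - d) for p, d in zip(X, dX)))
    print("max input-derivative gap:", max(abs(p - d) for p, d in zip(U, dU)))
```

In this toy setting the analytical derivatives from the auxiliary system can be compared against the finite-difference estimates. The general method described in the abstract covers nonlinear dynamics, policies, and objective parameters, which this scalar sketch does not attempt to reproduce.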


Pages: 14

