Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3

Authors:

Zhou, Yirui [1]

Zhang, Yangchun [1]

Liu, Xiaowei [1]

Wang, Wanying [1]

Che, Zhengping [2]

Xu, Zhiyuan [2]

Tang, Jian [2]

Peng, Yaxin [1]

Affiliations:

[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China

[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China

Source:

PARALLEL PROBLEM SOLVING FROM NATURE - PPSN XVII, PPSN 2022, PT I | 2022 / Vol. 13398

Keywords:

Generative adversarial imitation learning; Generalization; Computation; Policy classes;

DOI:

10.1007/978-3-031-14714-2_27

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment with unknown reward functions. Unlike existing works that study the generalization of reward function classes or discriminator classes, we focus on policy classes. This paper investigates the generalization and computation for policy classes of GAIL. Specifically, our contributions are: 1) We prove that generalization is guaranteed in GAIL when the complexity of policy classes is properly controlled. 2) We provide an off-policy framework called the two-stage stochastic gradient (TSSG), which can efficiently solve GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are carried out in MuJoCo environments.

Pages: 385-399

Page count: 15

50 items in total

[31] Sample-Efficient Imitation Learning via Generative Adversarial Nets
Blonde, Lionel
Kalousis, Alexandros
22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
[32] Generative Adversarial Imitation Learning from Human Behavior with Reward Shaping
Li, Jiangeng
Huang, Shuai
Xu, Xin
Zuo, Guoyu
2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022: 6254 - 6259
[33] DIVINE: A Generative Adversarial Imitation Learning Framework for Knowledge Graph Reasoning
Li, Ruiping
Cheng, Xiang
2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019: 2642 - 2651
[34] Exploring Gradient Explosion in Generative Adversarial Imitation Learning: A Probabilistic Perspective
Wang, Wanying
Zhu, Yichen
Zhou, Yirui
Shen, Chaomin
Tang, Jian
Xu, Zhiyuan
Peng, Yaxin
Zhang, Yangchun
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024: 15625 - 15633
[35] Generative Adversarial Imitation Learning to Search in Branch-and-Bound Algorithms
Wang, Qi
Blackley, Suzanne, V
Tang, Chunlei
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT II, 2022: 673 - 680
[36] Drone Navigation in Unreal Engine Using Generative Adversarial Imitation Learning
Bandela, Suraj
Cao, Yongcan
AIAA SCITECH 2023 FORUM, 2023
[37] AugGAIL : Augmented generative adversarial imitation learning for robotic manipulation tasks
Jung E.
Lee S.
Kim I.
Journal of Institute of Control, Robotics and Systems, 2020, 26 (05) : 325 - 334
[38] When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence
Guan, Ziwei
Xu, Tengyu
Liang, Yingbin
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
[39] Weight Adaptive Generative Adversarial Imitation Learning Based on Noise Contrastive Estimation
Guan, Weifan
Zhang, Xi
Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2023, 36 (04): 300 - 312
[40] Dynamic Economic Dispatch of Power System Based on Generative Adversarial Imitation Learning
Chen H.
Meng F.
Zhang Y.
Sun Y.
Zhang J.
Shan L.
Lü X.
Zhang P.
Dianwang Jishu/Power System Technology, 2022, 46 (11): 4373 - 4380
