Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3
Authors
Zhou, Yirui [1]
Zhang, Yangchun [1]
Liu, Xiaowei [1]
Wang, Wanying [1]
Che, Zhengping [2]
Xu, Zhiyuan [2]
Tang, Jian [2]
Peng, Yaxin [1]
Affiliations
[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China
[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China
Keywords
Generative adversarial imitation learning; Generalization; Computation; Policy classes
DOI
10.1007/978-3-031-14714-2_27
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment with unknown reward functions. Unlike existing works, which study the generalization of reward function classes or discriminator classes, we focus on policy classes. This paper investigates the generalization and computation for policy classes of GAIL. Specifically, our contributions are: 1) We prove that generalization is guaranteed in GAIL when the complexity of the policy class is properly controlled. 2) We provide an off-policy framework called the two-stage stochastic gradient (TSSG), which can efficiently solve GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are presented in MuJoCo environments.
Pages: 385-399
Page count: 15