Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by: 3
Authors
Zhou, Yirui [1]
Zhang, Yangchun [1]
Liu, Xiaowei [1]
Wang, Wanying [1]
Che, Zhengping [2]
Xu, Zhiyuan [2]
Tang, Jian [2]
Peng, Yaxin [1]
Affiliations
[1] Shanghai Univ, Sch Sci, Dept Math, Shanghai 200444, Peoples R China
[2] Midea Grp, AI Innovat Ctr, Shanghai 201702, Peoples R China
Keywords
Generative adversarial imitation learning; Generalization; Computation; Policy classes;
DOI
10.1007/978-3-031-14714-2_27
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Generative adversarial imitation learning (GAIL) learns an optimal policy from expert demonstrations in an environment whose reward function is unknown. Unlike existing work, which studies the generalization of reward function classes or discriminator classes, we focus on policy classes. This paper investigates the generalization and computation for policy classes of GAIL. Our contributions are twofold: 1) We prove that generalization is guaranteed in GAIL when the complexity of the policy class is properly controlled. 2) We provide an off-policy framework called two-stage stochastic gradient (TSSG), which efficiently solves GAIL based on soft policy iteration and attains a sublinear convergence rate to a stationary solution. Comprehensive numerical simulations are carried out in MuJoCo environments.
Pages: 385-399
Page count: 15
Related papers
50 in total
  • [41] xGAIL: Explainable Generative Adversarial Imitation Learning for Explainable Human Decision Analysis
    Pan, Menghai
    Huang, Weixiao
    Li, Yanhua
    Zhou, Xun
    Luo, Jun
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1334 - 1343
  • [42] Independent Generative Adversarial Self-Imitation Learning in Cooperative Multiagent Systems
    Hao, Xiaotian
    Wang, Weixun
    Hao, Jianye
    Yang, Yaodong
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1315 - 1323
  • [43] MULTITASK GENERATIVE ADVERSARIAL IMITATION LEARNING FOR MULTI-DOMAIN DIALOGUE SYSTEM
    Hsu, Chuan-En
    Rohmatillah, Mahdin
    Chien, Jen-Tzung
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 954 - 961
  • [44] Collaborative Robot-Assisted Endovascular Catheterization with Generative Adversarial Imitation Learning
    Chi, Wenqiang
    Dagnino, Giulio
    Kwok, Trevor M. Y.
    Anh Nguyen
    Kundrat, Dennis
    Abdelaziz, Mohamed E. M. K.
    Riga, Celia
    Bicknell, Colin
    Yang, Guang-Zhong
    2020 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2020, : 2414 - 2420
  • [45] Generative Adversarial Imitation Learning Based Bicycle Behaviors Simulation on Road Segments
    Wei, Shuqiao
    Ni, Ying
    Sun, Jian
    Qiu, Hongtong
    Jiaotong Yunshu Xitong Gongcheng Yu Xinxi/Journal of Transportation Systems Engineering and Information Technology, 2024, 24 (04) : 105 - 115
  • [46] Restored Action Generative Adversarial Imitation Learning from observation for robot manipulator
    Park, Jongcheon
    Han, Seungyong
    Lee, S. M.
    ISA TRANSACTIONS, 2022, 129 : 684 - 690
  • [47] Goal Conditioned Generative Adversarial Imitation Learning Based on Dueling-DQN
    Xu, Ziqi
    Wang, Shaofan
    Li, Ke
    PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 2365 - 2378
  • [48] A Mixed Generative Adversarial Imitation Learning Based Vehicle Path Planning Algorithm
    Yang, Zan
    Nai, Wei
    Li, Dan
    Liu, Lu
    Chen, Ziyu
    IEEE ACCESS, 2024, 12 : 85859 - 85879
  • [49] Generative adversarial interactive imitation learning for path following of autonomous underwater vehicle
    Jiang, Dong
    Huang, Jie
    Fang, Zheng
    Cheng, Chunxi
    Sha, Qixin
    He, Bo
    Li, Guangliang
    OCEAN ENGINEERING, 2022, 260
  • [50] TrajGAIL: Generating urban vehicle trajectories using generative adversarial imitation learning
    Choi, Seongjin
    Kim, Jiwon
    Yeo, Hwasoo
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2021, 128