Provably Efficient Adversarial Imitation Learning with Unknown Transitions

Cited by: 0
Authors
Xu, Tian [1, 4]
Li, Ziniu [2, 3]
Yu, Yang [1, 4]
Luo, Zhi-Quan [2, 3]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Chinese Univ Hong Kong, Shenzhen, Peoples R China
[3] Shenzhen Res Inst Big Data, Shenzhen, Peoples R China
[4] Polixir Ai, Nanjing, Peoples R China
Source
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imitation learning (IL) has proven to be an effective method for learning good policies from expert demonstrations. Adversarial imitation learning (AIL), a subset of IL methods, is particularly promising, but its theoretical foundation in the presence of unknown transitions has yet to be fully developed. This paper explores the theoretical underpinnings of AIL in this context, where the stochastic and uncertain nature of environment transitions presents a challenge. We examine the expert sample complexity and interaction complexity required to recover good policies. To this end, we establish a framework connecting reward-free exploration and AIL, and propose an algorithm, MB-TAIL, that achieves the minimax optimal expert sample complexity of $\widetilde{O}(H^{3/2}|S|/\varepsilon)$ and interaction complexity of $\widetilde{O}(H^{3}|S|^{2}|A|/\varepsilon^{2})$. Here, $H$ represents the planning horizon, $|S|$ is the state space size, $|A|$ is the action space size, and $\varepsilon$ is the desired imitation gap. MB-TAIL is the first algorithm to achieve this level of expert sample complexity in the unknown transition setting and improves upon the interaction complexity of the best-known algorithm, OAL, by $O(H)$. Additionally, we demonstrate the generalization ability of MB-TAIL by extending it to the function approximation setting and proving that it can achieve expert sample and interaction complexity independent of $|S|$.
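To get a feel for how the two rates quoted in the abstract scale, the following minimal sketch evaluates them for a small tabular problem. It is only an illustration: the function names and the example values of the horizon, state count, action count, and imitation gap are hypothetical, and the constants and logarithmic factors hidden by the tilde-O notation are dropped, so the outputs are growth rates and not actual sample counts.

```python
def expert_sample_rate(H, S, eps):
    """Leading term H^{3/2} * |S| / eps of the expert sample complexity.

    Constants and log factors hidden by the tilde-O notation are omitted.
    """
    return H ** 1.5 * S / eps


def interaction_rate(H, S, A, eps):
    """Leading term H^3 * |S|^2 * |A| / eps^2 of the interaction complexity.

    Constants and log factors hidden by the tilde-O notation are omitted.
    """
    return H ** 3 * S ** 2 * A / eps ** 2


if __name__ == "__main__":
    # Hypothetical problem size, chosen only for illustration.
    H, S, A, eps = 10, 100, 5, 0.1
    print("expert sample rate:", expert_sample_rate(H, S, eps))
    print("interaction rate:  ", interaction_rate(H, S, A, eps))
    # Halving eps doubles the expert rate but quadruples the interaction rate.
    print("ratio (expert):     ", expert_sample_rate(H, S, eps / 2) / expert_sample_rate(H, S, eps))
    print("ratio (interaction):", interaction_rate(H, S, A, eps / 2) / interaction_rate(H, S, A, eps))
```

The comparison at the end reflects the different dependence on the imitation gap: the expert rate is linear in the inverse gap, while the interaction rate is quadratic.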
Pages: 2367-2378
Number of pages: 12
Related papers
50 in total
  • [31] Emergence of Chaotic Time Series by Adversarial Imitation Learning
    Yamazaki, Seiya
    Iizuka, Hiroyuki
    Yamamoto, Masahito
    2018 CONFERENCE ON ARTIFICIAL LIFE (ALIFE 2018), 2018, : 659 - 664
  • [32] Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach
    Liao, Luofeng
    Chen, You-Lin
    Yang, Zhuoran
    Dai, Bo
    Wang, Zhaoran
    Kolar, Mladen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [33] Generative Adversarial Imitation Learning from Failed Experiences
    Zhu, Jiacheng
    Lin, Jiahao
    Wang, Meng
    Chen, Yingfeng
    Fan, Changjie
    Jiang, Chong
    Zhang, Zongzhang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13997 - 13998
  • [34] Ranking-Based Generative Adversarial Imitation Learning
    Shi, Zhipeng
    Zhang, Xuehe
    Fang, Yu
    Li, Changle
    Liu, Gangfeng
    Zhao, Jie
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10): : 8967 - 8974
  • [35] Visual Adversarial Imitation Learning using Variational Models
    Rafailov, Rafael
    Yu, Tianhe
    Rajeswaran, Aravind
    Finn, Chelsea
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
  • [36] Complexity of bird song caused by adversarial imitation learning
    Yamazaki, Seiya
    Iizuka, Hiroyuki
    Yamamoto, Masahito
    ARTIFICIAL LIFE AND ROBOTICS, 2020, 25 (01) : 124 - 132
  • [37] Adversarial Imitation Learning with Controllable Rewards for Text Generation
    Nishikino, Keizaburo
    Kobayashi, Kenichi
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT I, 2023, 14169 : 131 - 146
  • [38] Multimodal Storytelling via Generative Adversarial Imitation Learning
    Chen, Zhiqian
    Zhang, Xuchao
    Boedihardjo, Arnold P.
    Dai, Jing
    Lu, Chang-Tien
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3967 - 3973
  • [39] Multi-Agent Generative Adversarial Imitation Learning
    Song, Jiaming
    Ren, Hongyu
    Sadigh, Dorsa
    Ermon, Stefano
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [40] ARC - Actor Residual Critic for Adversarial Imitation Learning
    Deka, Ankur
    Liu, Changliu
    Sycara, Katia
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1446 - 1456