Provably Efficient Adversarial Imitation Learning with Unknown Transitions

Cited by: 0
Authors
Xu, Tian [1 ,4 ]
Li, Ziniu [2 ,3 ]
Yu, Yang [1 ,4 ]
Luo, Zhi-Quan [2 ,3 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Chinese Univ Hong Kong, Shenzhen, Peoples R China
[3] Shenzhen Res Inst Big Data, Shenzhen, Peoples R China
[4] Polixir Ai, Nanjing, Peoples R China
Source
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Imitation learning (IL) has proven to be an effective method for learning good policies from expert demonstrations. Adversarial imitation learning (AIL), a subset of IL methods, is particularly promising, but its theoretical foundation in the presence of unknown transitions has yet to be fully developed. This paper explores the theoretical underpinnings of AIL in this context, where the stochastic and uncertain nature of environment transitions presents a challenge. We examine the expert sample complexity and interaction complexity required to recover good policies. To this end, we establish a framework connecting reward-free exploration and AIL, and propose an algorithm, MB-TAIL, that achieves the minimax optimal expert sample complexity of $\widetilde{O}(H^{3/2}|S|/\varepsilon)$ and interaction complexity of $\widetilde{O}(H^{3}|S|^{2}|A|/\varepsilon^{2})$. Here, $H$ represents the planning horizon, $|S|$ is the state space size, $|A|$ is the action space size, and $\varepsilon$ is the desired imitation gap. MB-TAIL is the first algorithm to achieve this level of expert sample complexity in the unknown transition setting and improves upon the interaction complexity of the best-known algorithm, OAL, by a factor of $O(H)$. Additionally, we demonstrate the generalization ability of MB-TAIL by extending it to the function approximation setting and proving that it can achieve expert sample and interaction complexity independent of $|S|$.
Pages: 2367 - 2378
Page count: 12
Related papers
50 in total
  • [21] Provably Efficient Third-Person Imitation from Offline Observation
    Zweig, Aaron
    Bruna, Joan
    CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020), 2020, 124 : 1228 - 1237
  • [22] Self-Supervised Adversarial Imitation Learning
    Monteiro, Juarez
    Gavenski, Nathan
    Meneguzzi, Felipe
    Barros, Rodrigo C.
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [23] Randomized Adversarial Imitation Learning for Autonomous Driving
    Shin, MyungJae
    Kim, Joongheon
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4590 - 4596
  • [24] Combating False Negatives in Adversarial Imitation Learning
    Zolna, Konrad
    Saharia, Chitwan
    Boussioux, Leonard
    Hui, David Yu-Tung
    Chevalier-Boisvert, Maxime
    Bahdanau, Dzmitry
    Bengio, Yoshua
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 13999 - 14000
  • [25] Adversarial Imitation Learning from Incomplete Demonstrations
    Sun, Mingfei
    Xiaojuan
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3513 - 3519
  • [26] Provably Efficient Learning of Transferable Rewards
    Metelli, Alberto Maria
    Ramponi, Giorgia
    Concetti, Alessandro
    Restelli, Marcello
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [27] Is Q-learning Provably Efficient?
    Jin, Chi
    Allen-Zhu, Zeyuan
    Bubeck, Sebastien
    Jordan, Michael I.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [28] Robot Manipulation Learning Using Generative Adversarial Imitation Learning
    Jabri, Mohamed Khalil
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4893 - 4894
  • [29] A Survey of Imitation Learning Based on Generative Adversarial Nets
    Lin J.-H.
    Zhang Z.-Z.
    Jiang C.
    Hao J.-Y.
    Jisuanji Xuebao/Chinese Journal of Computers, 2020, 43 (02): : 326 - 351
  • [30] End-to-End Differentiable Adversarial Imitation Learning
    Baram, Nir
    Anschel, Oron
    Caspi, Itai
    Mannor, Shie
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70