Adversarial Imitation Learning from Incomplete Demonstrations

Cited by: 0
Authors:
Sun, Mingfei [1]; Ma, Xiaojuan [1]
Affiliations:
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
Keywords: (none listed)
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
Imitation learning aims to derive a mapping from states to actions, a.k.a. a policy, from expert demonstrations. Existing imitation learning methods typically require all actions in the demonstrations to be fully observable, which is hard to ensure in real applications. Although algorithms for learning with unobservable actions have been proposed, they rely solely on state information and overlook the fact that the action sequence may still be partially available and provide useful information for policy derivation. In this paper, we propose a novel algorithm called Action-Guided Adversarial Imitation Learning (AGAIL) that learns a policy from demonstrations with incomplete action sequences, i.e., incomplete demonstrations. The core idea of AGAIL is to separate demonstrations into state and action trajectories, training a policy from the state trajectories while using the actions, whenever available, as auxiliary information to guide the training. Built upon Generative Adversarial Imitation Learning, AGAIL has three components: a generator, a discriminator, and a guide. The generator learns a policy from rewards provided by the discriminator, which tries to distinguish the state distribution of the demonstrations from that of the samples generated by the policy. The guide provides additional rewards to the generator when demonstrated actions for specific states are available. We compare AGAIL to other methods on benchmark tasks and show that AGAIL consistently delivers performance comparable to state-of-the-art methods even when the action sequences in the demonstrations are only partially available.
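The abstract describes a generator-discriminator-guide split: the discriminator scores states only, while the guide exploits whatever demonstrated actions happen to be observed. The sketch below illustrates one way those two reward signals could be produced. It is a minimal sketch assuming a PyTorch-style setup; the class names Discriminator and Guide, the network sizes, the squared-error guide objective, and the lam-weighted reward combination are illustrative assumptions, not details taken from the paper.

# Minimal illustrative sketch (PyTorch) of the three components named in the
# abstract. Module names, network sizes, the squared-error guide loss, and the
# lambda-weighted reward combination are assumptions for illustration only.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    # Scores states only: output near 1 means "looks like an expert-visited state".
    def __init__(self, state_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, states):
        return self.net(states)

class Guide(nn.Module):
    # Predicts the demonstrated action for a state; trained only on the subset
    # of demonstration steps whose actions are actually observed.
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim))

    def forward(self, states):
        return self.net(states)

def train_guide_step(guide, optimizer, demo_states, demo_actions, has_action):
    # Incomplete demonstrations: has_action is a boolean mask marking which
    # demonstration steps include an observed action.
    mask = has_action.bool()
    loss = ((guide(demo_states[mask]) - demo_actions[mask]) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def imitation_reward(disc, guide, states, policy_actions, lam=0.5):
    # Adversarial reward from the state-only discriminator, plus an auxiliary
    # guide reward for agreeing with the action the guide predicts at each
    # visited state.
    with torch.no_grad():
        adv_reward = -torch.log(1.0 - disc(states) + 1e-8).squeeze(-1)
        guide_reward = -((guide(states) - policy_actions) ** 2).mean(dim=-1)
    return adv_reward + lam * guide_reward

In use, the combined reward would replace the environment reward inside a standard policy-gradient learner (e.g., TRPO or PPO), which plays the role of the generator.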
Pages: 3513 - 3519
Page count: 7