Hierarchical Imitation Learning for Stochastic Environments

Cited by: 0

Authors:

Igl, Maximilian [1]

Shah, Punit [1]

Mougin, Paul [1]

Srinivasan, Sirish [1]

Gupta, Tarun [1]

White, Brandyn [1]

Shiarlis, Kyriacos [1]

Whiteson, Shimon [1]

Affiliations:

[1] Waymo Res, Mountain View, CA 94043 USA

Source:

2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS | 2023

Keywords:

DOI:

10.1109/IROS55552.2023.10341451

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Many applications of imitation learning require the agent to generate the full distribution of behaviour observed in the training data. For example, to evaluate the safety of autonomous vehicles in simulation, accurate and diverse behaviour models of other road users are paramount. Existing methods that improve this distributional realism typically rely on hierarchical policies. These condition the policy on types such as goals or personas that give rise to multi-modal behaviour. However, such methods are often inappropriate for stochastic environments where the agent must also react to external factors: because agent types are inferred from the observed future trajectory during training, these environments require that the contributions of internal and external factors to the agent behaviour are disentangled and only internal factors, i.e., those under the agent's control, are encoded in the type. Encoding future information about external factors leads to inappropriate agent reactions during testing, when the future is unknown and types must be drawn independently from the actual future. We formalize this challenge as distribution shift in the conditional distribution of agent types under environmental stochasticity. We propose Robust Type Conditioning (RTC), which eliminates this shift with adversarial training under randomly sampled types. Experiments on two domains, including the large-scale Waymo Open Motion Dataset, show improved distributional realism while maintaining or improving task performance compared to state-of-the-art baselines.

Pages: 1697 - 1704

Page count: 8

50 items in total

[21] Hierarchical Imitation Learning via Subgoal Representation Learning for Dynamic Treatment Recommendation
Wang, Lu
Tang, Ruiming
He, Xiaofeng
He, Xiuqiang
WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022, : 1081 - 1089
[22] CLIC: Curriculum Learning and Imitation for Object Control in Nonrewarding Environments
Fournier, Pierre
Colas, Cedric
Chetouani, Mohamed
Sigaud, Olivier
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (02) : 239 - 248
[23] Deep Imitation Learning for Autonomous Navigation in Dynamic Pedestrian Environments
Qin, Lei
Huang, Zefan
Zhang, Chen
Guo, Hongliang
Ang, Marcelo, Jr.
Rus, Daniela
2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 4108 - 4115
[24] STOCHASTIC-PROCESSES IMITATION WITH THE LEARNING ON THE LIMITED DATA SAMPLE
SAVCHENKO, VV
KHOLOPENKOV, SV
RADIOTEKHNIKA I ELEKTRONIKA, 1991, 36 (04) : 828 - 831
[25] HIERARCHICAL LEVELS OF IMITATION
BYRNE, RW
BEHAVIORAL AND BRAIN SCIENCES, 1993, 16 (03) : 516 - 517
[26] A Hierarchical Imitation Learning-based Decision Framework for Autonomous Driving
Liang, Hebin
Dong, Zibin
Ma, Yi
Hao, Xiaotian
Zheng, Yan
Hao, Jianye
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 4695 - 4701
[27] Hierarchical Interpretable Imitation Learning for End-to-End Autonomous Driving
Teng, Siyu
Chen, Long
Ai, Yunfeng
Zhou, Yuanye
Xuanyuan, Zhe
Hu, Xuemin
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (01) : 673 - 683
[28] Interpretable Motion Planner for Urban Driving via Hierarchical Imitation Learning
Wang, Bikun
Wang, Zhipeng
Zhu, Chenhao
Zhang, Zhiqiang
Wang, Zhichen
Lin, Penghong
Liu, Jingchu
Zhang, Qian
2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS, 2023, : 1691 - 1696
[29] Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving
Bronstein, Eli
Palatucci, Mark
Notz, Dominik
White, Brandyn
Kuefler, Alex
Lu, Yiren
Paul, Supratik
Nikdel, Payam
Mougin, Paul
Chen, Hongge
Fu, Justin
Abrams, Austin
Shah, Punit
Racah, Evan
Frenkel, Benjamin
Whiteson, Shimon
Anguelov, Dragomir
2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 8652 - 8659
[30] Symmetrical Hierarchical Stochastic Searching on the Line in Informative and Deceptive Environments
Zhang, Junqi
Wang, Yuheng
Wang, Cheng
Zhou, MengChu
IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (03) : 626 - 635
