DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving

Cited by: 1139
Authors
Chen, Chenyi [1 ]
Seff, Ari [1 ]
Kornhauser, Alain [1 ]
Xiao, Jianxiong [1 ]
Affiliation
[1] Princeton Univ, Princeton, NJ 08544 USA
Source
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2015
DOI
10.1109/ICCV.2015.312
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Today, there are two major paradigms for vision-based autonomous driving systems: mediated perception approaches that parse an entire scene to make a driving decision, and behavior reflex approaches that directly map an input image to a driving action by a regressor. In this paper, we propose a third paradigm: a direct perception approach to estimate the affordance for driving. We propose to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving. Our representation provides a set of compact yet complete descriptions of the scene to enable a simple controller to drive autonomously. Falling in between the two extremes of mediated perception and behavior reflex, we argue that our direct perception representation provides the right level of abstraction. To demonstrate this, we train a deep Convolutional Neural Network using recordings from 12 hours of human driving in a video game and show that our model can work well to drive a car in a very diverse set of virtual environments. We also train a model for car distance estimation on the KITTI dataset. Results show that our direct perception approach can generalize well to real driving images. Source code and data are available on our project website.
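The abstract's pipeline (image → a few affordance indicators → a simple hand-crafted controller) can be illustrated with a minimal sketch. This is not the authors' code: the indicator names, the lane-centering steering law, and the gap-based throttle rule below are hypothetical simplifications chosen for illustration; in the paper the indicators are produced by a trained ConvNet.

```python
from dataclasses import dataclass

@dataclass
class Affordances:
    """A few illustrative affordance indicators (hypothetical subset)."""
    angle: float        # heading angle between car and road tangent (rad)
    to_center: float    # signed lateral offset from the lane centre (m)
    dist_ahead: float   # distance to the preceding in-lane car (m)

def perceive(image) -> Affordances:
    """Stand-in for the CNN that maps an input image to indicators."""
    raise NotImplementedError("a trained ConvNet would go here")

def control(aff: Affordances, lane_width: float = 3.5,
            gain: float = 0.5, safe_gap: float = 20.0):
    """Toy controller: steer back toward the lane centre, slow down
    proportionally when the preceding car is closer than safe_gap."""
    steer = -gain * (aff.angle + aff.to_center / lane_width)
    throttle = 1.0 if aff.dist_ahead > safe_gap \
        else max(0.0, aff.dist_ahead / safe_gap)
    return steer, throttle
```

The point of the representation is visible in `control`: because the scene is summarized by a handful of metric quantities rather than a full parse or a raw image, the driving logic reduces to a few lines of feedback control.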
Pages: 2722-2730
Number of pages: 9