Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration
Cited by: 49
Authors:
Huang, De-An [1]
Nair, Suraj [1]
Xu, Danfei [1]
Zhu, Yuke [1]
Garg, Animesh [1]
Li Fei-Fei [1]
Savarese, Silvio [1]
Niebles, Juan Carlos [1]
Affiliations:
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
Source:
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019
DOI:
10.1109/CVPR.2019.00876
Chinese Library Classification (CLC) number:
TP18 [Artificial intelligence theory]
Subject classification codes:
081104; 0812; 0835; 1405
Abstract:
Our goal is to generate a policy to complete an unseen task given just a single video demonstration of the task in a given domain. We hypothesize that to successfully generalize to unseen complex tasks from a single video demonstration, it is necessary to explicitly incorporate the compositional structure of the tasks into the model. To this end, we propose Neural Task Graph (NTG) Networks, which use a conjugate task graph as the intermediate representation to modularize both the video demonstration and the derived policy. We empirically show that NTG achieves inter-task generalization on two complex tasks: Block Stacking in BulletPhysics and Object Collection in AI2-THOR. NTG improves data efficiency with visual input and achieves strong generalization without the need for dense hierarchical supervision. We further show that similar performance trends hold when NTG is applied to real-world data: NTG can effectively predict task structure on the JIGSAWS surgical dataset and generalize to unseen tasks.