SGL: Symbolic Goal Learning in a Hybrid, Modular Framework for Human Instruction Following

Cited by: 0

Authors:

Xu, Ruinian [1]

Chen, Hongyi [1]

Lin, Yunzhi [1]

Vela, Patricio A. [1]

Affiliations:

[1] Georgia Inst Technol, Inst Robot & Intelligent Machines, Atlanta, GA 30332 USA

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2022 / Vol. 7 / No. 4

Funding:

U.S. National Science Foundation;

Keywords:

Deep learning in grasping and manipulation; AI-enabled robotics; representation learning;

DOI:

10.1109/LRA.2022.3190076

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

This paper investigates human instruction following for robotic manipulation via a hybrid, modular system with symbolic and connectionist elements. Symbolic methods build modular systems with semantic parsing and task planning modules for producing sequences of actions from natural language requests. Modern connectionist methods employ deep neural networks that learn visual and linguistic features for mapping inputs to a sequence of low-level actions, in an end-to-end fashion. The hybrid, modular system blends these two approaches to create a modular framework: it formulates instruction following as symbolic goal learning via deep neural networks followed by task planning via symbolic planners. Connectionist and symbolic modules are bridged with Planning Domain Definition Language. The vision-and-language learning network predicts its goal representation, which is sent to a planner for producing a task-completing action sequence. For improving the flexibility of natural language, we further incorporate implicit human intents with explicit human instructions. To learn generic features for vision and language, we propose to separately pretrain vision and language encoders on scene graph parsing and semantic textual similarity tasks. Benchmarking evaluates the impacts of different components of, or options for, the vision-and-language learning model and shows the effectiveness of pretraining strategies. Manipulation experiments conducted in the simulator AI2THOR show the robustness of the framework to novel scenarios.

Pages: 10375-10382

Page count: 8
