Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

Cited by: 2
Authors
Bao, Jiatong [1]
Jia, Yunyi [2]
Cheng, Yu [3]
Tang, Hongru [1]
Xi, Ning [3]
Affiliations
[1] Yangzhou Univ, Dept Hydraul Energy & Power Engn, Yangzhou 225127, Jiangsu, Peoples R China
[2] Clemson Univ, Dept Automot Engn, Greenville, SC 29607 USA
[3] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Source
SENSORS | 2016 / Vol. 16 / Issue 12
Keywords
object grounding; target object detection; object recognition; natural language processing; natural language control; robotic manipulation system; FRAMEWORK;
DOI
10.3390/s16122117
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Controlling robots by natural language (NL) is attracting increasing attention because it is versatile, convenient, and requires no extensive training for users. Grounding, which enables robots to understand NL instructions from humans, is a crucial challenge in this problem. This paper explores the object grounding problem and studies how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. Using the metric information of all segmented objects, the object attributes and the relations between objects are then extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework to search all possible object grounding states. The final grounding is accomplished by selecting the states with the maximum probabilities. An RGB-D scene dataset, associated with different groups of NL instructions based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.
Pages: 23
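The abstract describes matching NL annotations against attributes and relations extracted from the scene, then keeping the grounding states with the maximum probability. The following is only a toy sketch of that idea, not the paper's framework: the `SceneObject` fields, the annotation keys (`color`, `shape`, `left_of`), the `cue_probability` helper, and the fixed 0.9 match probability are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from math import isclose


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical record for one segmented object."""
    name: str
    color: str
    shape: str
    x: float  # assumed horizontal position; smaller means further left


def cue_probability(matches: bool, p_match: float = 0.9) -> float:
    """Assumed likelihood of one cue: high if it matches, low otherwise."""
    return p_match if matches else 1.0 - p_match


def ground(annotation: dict, objects: list) -> list:
    """Score every object against the annotation and return all top scorers."""
    scores = {}
    for obj in objects:
        p = 1.0
        if "color" in annotation:
            p *= cue_probability(obj.color == annotation["color"])
        if "shape" in annotation:
            p *= cue_probability(obj.shape == annotation["shape"])
        if "left_of" in annotation:
            # Relation cue: the object lies left of the named reference object.
            refs = [o for o in objects if o.name == annotation["left_of"]]
            p *= cue_probability(any(obj.x < r.x for r in refs))
        scores[obj.name] = p
    if not scores:
        return []
    best = max(scores.values())
    # Keep every state tied for the maximum probability.
    return [name for name, p in scores.items() if isclose(p, best)]


scene = [
    SceneObject("obj1", "red", "box", 0.1),
    SceneObject("obj2", "red", "box", 0.6),
    SceneObject("obj3", "blue", "cup", 0.4),
]

# "the red box to the left of obj3"
print(ground({"color": "red", "shape": "box", "left_of": "obj3"}, scene))
```

With only a color cue, both red objects tie, which shows how an under-specified instruction leaves more than one maximum-probability state.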