Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

Cited by: 2
Authors
Bao, Jiatong [1]
Jia, Yunyi [2]
Cheng, Yu [3]
Tang, Hongru [1]
Xi, Ning [3]
Affiliations
[1] Yangzhou Univ, Dept Hydraul Energy & Power Engn, Yangzhou 225127, Jiangsu, Peoples R China
[2] Clemson Univ, Dept Automot Engn, Greenville, SC 29607 USA
[3] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
Source
SENSORS | 2016 / Vol. 16 / Issue 12
Keywords
object grounding; target object detection; object recognition; natural language processing; natural language control; robotic manipulation system; FRAMEWORK;
DOI
10.3390/s16122117
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Controlling robots by natural language (NL) is attracting increasing attention for its versatility and convenience, and because users need no extensive training. Grounding, which enables robots to understand NL instructions from humans, is a crucial challenge in this problem. This paper mainly explores the object grounding problem and studies concretely how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. From the metric information of all segmented objects, object attributes and relations between objects are then extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework that searches all possible object grounding states. The final grounding is accomplished by selecting the states with the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions, based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming with a mobile manipulator show its effectiveness and practicability in robotic applications.
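The final step described in the abstract, selecting the grounding states with maximum probability, can be illustrated with a toy sketch. This is not the paper's state estimation framework: the `SceneObject` fields, the `match_probability` scoring rule, and the `mismatch` penalty are all hypothetical and chosen only to show annotated cues being matched against extracted object attributes and the most probable candidates being kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    # Hypothetical attributes a vision pipeline might extract per segmented object.
    name: str
    color: str
    shape: str


def match_probability(obj, annotation, mismatch=0.1):
    """Toy score: multiply 1.0 per matching cue, `mismatch` per non-matching cue.

    The scoring rule and the `mismatch` value are illustrative assumptions.
    """
    score = 1.0
    for cue, wanted in annotation.items():
        score *= 1.0 if getattr(obj, cue) == wanted else mismatch
    return score


def ground(objects, annotation):
    """Return the objects with maximum normalized probability, plus all probabilities."""
    scores = [match_probability(o, annotation) for o in objects]
    total = sum(scores)  # positive as long as objects is non-empty and mismatch > 0
    probs = [s / total for s in scores]
    best = max(probs)
    targets = [o for o, p in zip(objects, probs) if p == best]
    return targets, probs


scene = [
    SceneObject("obj0", "red", "cube"),
    SceneObject("obj1", "blue", "cube"),
    SceneObject("obj2", "red", "ball"),
]

# Annotation as it might be parsed from "pick up the red cube" (format assumed).
targets, probs = ground(scene, {"color": "red", "shape": "cube"})
print([t.name for t in targets], probs)
```

An instruction with fewer cues, such as `{"color": "red"}`, leaves two equally probable candidates in this toy scene, which is why the sketch returns every state that reaches the maximum rather than a single object.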
Pages: 23