Vision-Based Efficient Robotic Manipulation with a Dual-Streaming Compact Convolutional Transformer

Cited by: 2
Authors
Guo, Hao [1 ]
Song, Meichao [1 ]
Ding, Zhen [1 ]
Yi, Chunzhi [2 ]
Jiang, Feng [1 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Inst Technol, Sch Med & Hlth, Harbin 150001, Peoples R China
Keywords
bio-inspired design and control of robots; robotics; reinforcement learning; vision transformer; LEVEL
DOI
10.3390/s23010515
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline Classification Codes
070302; 081704
Abstract
Learning efficient robotic manipulation from visual observation remains a significant challenge in Reinforcement Learning (RL). Although pairing an RL policy with a convolutional neural network (CNN) visual encoder achieves high efficiency and success rates, the general performance of such methods across multiple tasks is still limited by the efficacy of the encoder. Meanwhile, the growing cost of optimizing the encoder for general performance can erode the efficiency advantage of the original policy. Building on the attention mechanism, we design a robotic manipulation method that significantly improves the policy's general performance across multiple tasks by using a lite Transformer-based visual encoder together with unsupervised learning and data augmentation. The encoder matches the performance of the original Transformer with far less data, keeping training efficient while strengthening general multi-task performance. Furthermore, when combining third-person and egocentric views to assimilate global and local visual information, we experimentally demonstrate that the master view outperforms the alternative third-person views on general robotic manipulation tasks. In extensive experiments on tasks from the OpenAI Gym Fetch environment, notably the Push task, our method achieves a 92% success rate with fewer training steps, versus baseline success rates of 65%, 78% for the CNN encoder, and 81% for the ViT encoder.
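To make the described architecture concrete, the following is a minimal PyTorch sketch (an illustration based on the abstract, not the authors' released code) of a dual-stream, compact-convolutional-transformer-style encoder: each camera view (the third-person "master" view and the egocentric view) is tokenized by a small convolutional stem, the two token sequences are fused by a lightweight Transformer, and CCT-style sequence pooling produces a compact state embedding that an RL policy could consume. All class names, layer sizes, depths, and the 84x84 input resolution are assumptions made for illustration only.

import torch
import torch.nn as nn


class ConvTokenizer(nn.Module):
    """Convolutional stem that turns one camera image into a sequence of patch tokens."""
    def __init__(self, in_ch=3, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )

    def forward(self, x):                       # x: (B, 3, H, W)
        feat = self.net(x)                      # (B, dim, H', W')
        return feat.flatten(2).transpose(1, 2)  # (B, H'*W', dim) token sequence


class DualStreamCCTEncoder(nn.Module):
    """Fuses third-person and egocentric tokens with a small Transformer encoder,
    then sequence-pools them into a single state embedding for the RL policy."""
    def __init__(self, dim=128, depth=4, heads=4, out_dim=50):
        super().__init__()
        self.third_person = ConvTokenizer(dim=dim)
        self.egocentric = ConvTokenizer(dim=dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=2 * dim,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.attn_pool = nn.Linear(dim, 1)      # CCT-style attention (sequence) pooling
        self.head = nn.Linear(dim, out_dim)

    def forward(self, third_view, ego_view):
        tokens = torch.cat([self.third_person(third_view),
                            self.egocentric(ego_view)], dim=1)   # global + local tokens
        tokens = self.transformer(tokens)
        weights = torch.softmax(self.attn_pool(tokens), dim=1)   # (B, N, 1)
        pooled = (weights * tokens).sum(dim=1)                   # (B, dim)
        return self.head(pooled)                # compact state embedding for the policy


# Usage: encode a batch of 84x84 observations from both cameras.
enc = DualStreamCCTEncoder()
z = enc(torch.randn(2, 3, 84, 84), torch.randn(2, 3, 84, 84))
print(z.shape)  # torch.Size([2, 50])

The convolutional stems keep the token count and parameter budget small relative to a standard ViT patch embedding, which is the property the abstract attributes to the lite (compact convolutional) Transformer encoder; the exact fusion and pooling choices shown here are one plausible realization, not necessarily the paper's.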
Pages: 17