Vision-Based Efficient Robotic Manipulation with a Dual-Streaming Compact Convolutional Transformer

Cited by: 2
Authors
Guo, Hao [1]
Song, Meichao [1]
Ding, Zhen [1]
Yi, Chunzhi [2]
Jiang, Feng [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[2] Harbin Inst Technol, Sch Med & Hlth, Harbin 150001, Peoples R China
Keywords
bio-inspired design and control of robots; robotics; reinforcement learning; vision transformer; LEVEL;
DOI
10.3390/s23010515
Chinese Library Classification number
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
Learning from visual observations for efficient robotic manipulation remains a significant challenge in reinforcement learning (RL). Although combining RL policies with a convolutional neural network (CNN) visual encoder achieves high efficiency and success rates, the general performance of this approach across multiple tasks is still limited by the efficacy of the encoder. Meanwhile, the growing cost of optimizing the encoder for general performance could erode the efficiency advantage of the original policy. Building on the attention mechanism, we design a robotic manipulation method that significantly improves the general performance of the policy across multiple tasks by using a lightweight Transformer-based visual encoder, unsupervised learning, and data augmentation. The encoder in our method can match the performance of the original Transformer with much less data, which keeps training efficient and strengthens general multi-task performance. Furthermore, we experimentally demonstrate that the master view outperforms the alternative third-person views in general robotic manipulation tasks when third-person and egocentric views are combined to capture global and local visual information. In extensive experiments on tasks from the OpenAI Gym Fetch environment, most notably the Push task, our method reaches a 92% success rate, compared with baselines of 65%, 78% for the CNN encoder, and 81% for the ViT encoder, and does so with fewer training steps.
Pages: 17