Learning Robotic Manipulation Tasks via Task Progress Based Gaussian Reward and Loss Adjusted Exploration

Citations: 9

Authors:

Kumra, Sulabh [1, 2]

Joshi, Shirin [3]

Sahin, Ferat [4]

Affiliations:

[1] OSARO Inc, San Francisco, CA 94103 USA

[2] Rochester Inst Technol, Rochester, NY 14623 USA

[3] Siemens Corp, Corp Technol, Berkeley, CA 94703 USA

[4] Rochester Inst Technol, Multiagent Biorobot Lab, Rochester, NY 14623 USA

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2022 / Vol. 7 / No. 1

Keywords:

Robotic manipulation; reinforcement learning; deep learning

DOI:

10.1109/LRA.2021.3129833

Chinese Library Classification (CLC) number:

TP24 [Robotics]

Subject classification code:

080202; 1405

Abstract:

Multi-step manipulation tasks in unstructured environments are extremely challenging for a robot to learn. Such tasks interlace high-level reasoning, which identifies the expected states that can be attained to achieve an overall task, with low-level reasoning, which decides what actions will yield these states. We propose a model-free deep reinforcement learning method to learn multi-step manipulation tasks. We introduce a Robotic Manipulation Network (RoManNet), a vision-based model architecture, to learn the action-value functions and predict manipulation action candidates. We define a Task Progress based Gaussian (TPG) reward function that computes the reward based on actions that lead to successful motion primitives and progress towards the overall task goal. To balance exploration and exploitation, we introduce a Loss Adjusted Exploration (LAE) policy that selects actions from the action candidates according to the Boltzmann distribution of loss estimates. We demonstrate the effectiveness of our approach by training RoManNet to learn several challenging multi-step robotic manipulation tasks in both simulation and the real world. Experimental results show that our method outperforms existing methods and achieves state-of-the-art performance in terms of success rate and action efficiency. The ablation studies show that TPG and LAE are especially beneficial for tasks such as multiple block stacking.
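The abstract says only that LAE picks among action candidates "according to the Boltzmann distribution of loss estimates"; it does not give the formula. The sketch below is therefore a generic illustration of Boltzmann (softmax) sampling, not the authors' implementation. The function names, the `temperature` parameter, and the choice to give higher-loss candidates a higher selection probability are all assumptions made for illustration.

```python
import math
import random


def boltzmann_probabilities(loss_estimates, temperature=1.0):
    """Return softmax probabilities over per-candidate loss estimates.

    Assumption (not stated in the abstract): a larger loss estimate
    makes a candidate MORE likely to be explored.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [loss / temperature for loss in loss_estimates]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(loss_estimates, temperature=1.0, rng=None):
    """Sample one candidate index from the Boltzmann distribution."""
    rng = rng or random.Random()
    probs = boltzmann_probabilities(loss_estimates, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical loss estimates for three action candidates.
    losses = [0.1, 0.5, 2.0]
    print(boltzmann_probabilities(losses, temperature=0.5))
    print(sample_action(losses, temperature=0.5, rng=random.Random(0)))
```

In this sketch a low temperature concentrates probability on the candidate with the largest loss estimate, while a high temperature approaches uniform random selection, which is one common way to trade off exploration against exploitation.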

Pages: 534-541

Page count: 8
