Visual Rewards From Observation for Sequential Tasks: Autonomous Pile Loading

Cited by: 2

Authors:

Strokina, Nataliya [1]

Yang, Wenyan [1]

Pajarinen, Joni [2]

Serbenyuk, Nikolay [3]

Kaemaeraeinen, Joni [1]

Ghabcheloo, Reza [3]

Affiliations:

[1] Tampere Univ, Comp Sci, Tampere, Finland

[2] Aalto Univ, Dept Elect Engn & Automation, Espoo, Finland

[3] Tampere Univ, Automation Technol & Mech Engn, Tampere, Finland

Source:

FRONTIERS IN ROBOTICS AND AI | 2022 / Vol. 9

Funding:

Academy of Finland;

Keywords:

visual rewards; learning from demonstration; reinforcement learning; field robotics; earth moving; visual representations;

DOI:

10.3389/frobt.2022.838059

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

One of the key challenges in implementing reinforcement learning methods for real-world robotic applications is the design of a suitable reward function. In field robotics, the absence of abundant datasets, limited training time, and high variation of environmental conditions complicate the task further. In this paper, we review reward learning techniques together with visual representations commonly used in current state-of-the-art works in robotics. We investigate a practical approach proposed in prior work to associate the reward with the stage of the progress in task completion based on visual observation. This approach was demonstrated in controlled laboratory conditions. We study its potential for a real-scale field application, autonomous pile loading, tested outdoors in three seasons: summer, autumn, and winter. In our framework, the cumulative reward combines the predictions about the process stage and the task completion (terminal stage). We use supervised classification methods to train prediction models and investigate the most common state-of-the-art visual representations. We use task-specific contrastive features for terminal stage prediction.


Pages: 17
