Visual Rewards From Observation for Sequential Tasks: Autonomous Pile Loading

Cited by: 2
|
Authors
Strokina, Nataliya [1 ]
Yang, Wenyan [1 ]
Pajarinen, Joni [2 ]
Serbenyuk, Nikolay [3 ]
Kämäräinen, Joni [1]
Ghabcheloo, Reza [3 ]
Affiliations
[1] Tampere Univ, Comp Sci, Tampere, Finland
[2] Aalto Univ, Dept Elect Engn & Automation, Espoo, Finland
[3] Tampere Univ, Automation Technol & Mech Engn, Tampere, Finland
Funding
Academy of Finland;
Keywords
visual rewards; learning from demonstration; reinforcement learning; field robotics; earth moving; visual representations;
DOI
10.3389/frobt.2022.838059
Chinese Library Classification
TP24 [Robotics];
Discipline Code
080202; 1405;
Abstract
One of the key challenges in applying reinforcement learning to real-world robotic applications is the design of a suitable reward function. In field robotics, the scarcity of large datasets, limited training time, and high variability of environmental conditions complicate the task further. In this paper, we review reward learning techniques together with the visual representations commonly used in current state-of-the-art work in robotics. We investigate a practical approach, proposed in prior work, that associates the reward with the stage of progress toward task completion, estimated from visual observation. This approach was previously demonstrated only in controlled laboratory conditions. We study its potential for a full-scale field application, autonomous pile loading, tested outdoors in three seasons: summer, autumn, and winter. In our framework, the cumulative reward combines predictions of the process stage and of task completion (the terminal stage). We train the prediction models with supervised classification methods and evaluate the most common state-of-the-art visual representations, using task-specific contrastive features for terminal-stage prediction.
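The framework described in the abstract can be illustrated with a minimal sketch: a reward built from two learned predictors, one for the current progress stage and one for the terminal (completion) stage. All names below are hypothetical illustrations, not the paper's actual implementation; the toy classifiers stand in for the supervised models the paper trains on visual features.

```python
from typing import Callable, Sequence

def stage_reward(features: Sequence[float],
                 stage_clf: Callable,
                 terminal_clf: Callable,
                 terminal_bonus: float = 10.0) -> float:
    """Cumulative-reward sketch: the predicted progress stage plus a
    bonus once the terminal stage (task completion) is detected."""
    stage = stage_clf(features)      # integer stage index, 0..K-1
    done = terminal_clf(features)    # True once the task looks complete
    return float(stage) + (terminal_bonus if done else 0.0)

# Toy stand-ins for the trained classifiers (the paper learns these with
# supervised classification over visual representations):
toy_stage_clf = lambda f: min(int(sum(f)), 3)   # pretend there are 4 stages
toy_terminal_clf = lambda f: sum(f) >= 3        # e.g. "bucket emptied"

print(stage_reward([0.5, 0.4], toy_stage_clf, toy_terminal_clf))  # 0.0
print(stage_reward([2.0, 1.5], toy_stage_clf, toy_terminal_clf))  # 13.0
```

In practice the `features` would be a visual representation of the scene, and the two classifiers would be the trained stage and terminal-stage models; the bonus weighting is an arbitrary choice here.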
Pages: 17