A computational framework for attentional object discovery in RGB-D videos

Cited by: 0

Authors:

Germán Martín García

Mircea Pavel

Simone Frintrop

Affiliations:

[1] University of Bonn, Institute of Computer Science VI

[2] University of Hamburg, Computer Vision Group, Department of Informatics

Source:

Cognitive Processing | 2017 / Vol. 18

Keywords:

RGB-D object discovery; Computational visual attention; 3D inhibition of return

DOI:

Not available

Abstract:

We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. For this, we propose a visual object candidate generation method to produce object hypotheses about the objects in the scene. An attention system is used to prioritise the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism naturally copes with camera motions and inhibits objects that have already been the target of attention. Our approach provides object candidates which can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher level tasks, our architecture can be used as a first layer in any cognitive system that aims at interpreting a stream of images. We show in the evaluation how our framework finds most of the objects in challenging real-world scenes.
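The spatial IOR idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SpatialIOR`, the functions `camera_to_world` and `next_focus`, the fixed inhibition radius, and the assumption that a 4x4 camera-to-world pose is available for each frame are all hypothetical choices made for illustration. The sketch only shows why storing attended locations in world coordinates keeps inhibition valid when the camera moves.

```python
import math


def camera_to_world(point_cam, pose):
    """Map a 3D point from the camera frame to the world frame.

    pose is an assumed 4x4 camera-to-world homogeneous transform,
    given as a list of four rows.
    """
    x, y, z = point_cam
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose[:3])


class SpatialIOR:
    """Hypothetical inhibition-of-return memory kept in world coordinates."""

    def __init__(self, radius=0.15):
        self.radius = radius   # inhibition radius in metres (assumed value)
        self.attended = []     # world-frame points already attended

    def inhibit(self, point_world):
        self.attended.append(tuple(point_world))

    def is_inhibited(self, point_world):
        return any(math.dist(point_world, q) < self.radius
                   for q in self.attended)


def next_focus(candidates, pose, ior):
    """Return the world position of the most salient uninhibited candidate.

    candidates is a list of (saliency, centroid_in_camera_frame) pairs.
    The selected candidate is inhibited before it is returned.
    Returns None when every candidate has already been attended.
    """
    for _, centroid_cam in sorted(candidates, key=lambda c: c[0], reverse=True):
        centroid_world = camera_to_world(centroid_cam, pose)
        if not ior.is_inhibited(centroid_world):
            ior.inhibit(centroid_world)
            return centroid_world
    return None
```

Because inhibition is checked after the transform into the world frame, an object that was attended in an earlier frame stays inhibited even when it appears at different camera coordinates after the camera has moved.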

Pages: 169-182

Page count: 13
