A computational framework for attentional object discovery in RGB-D videos

Authors
Germán Martín García
Mircea Pavel
Simone Frintrop
Affiliations
[1] University of Bonn, Institute of Computer Science VI
[2] University of Hamburg, Computer Vision Group, Department of Informatics
Source
Cognitive Processing | 2017 / Vol. 18
Keywords
RGB-D object discovery; Computational visual attention; 3D inhibition of return
DOI
Not available
Abstract
We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. To this end, we propose a candidate generation method that produces hypotheses about the objects in the scene. An attention system prioritises the processing of visual information by (1) localising candidate objects and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism copes naturally with camera motion and inhibits objects that have already been the target of attention. Our approach provides object candidates that can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher-level tasks, our architecture can serve as a first layer in any cognitive system that aims to interpret a stream of images. In the evaluation, we show that our framework finds most of the objects in challenging real-world scenes.
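The idea of an IOR mechanism grounded in spatial coordinates can be illustrated with a minimal sketch. This is not the authors' implementation: the names `camera_to_world` and `SpatialIOR`, the fixed inhibition radius, and the assumption that a camera pose (rotation and translation) is available for each frame are all hypothetical choices made for illustration. The sketch stores attended locations in a world frame, so an object that was already attended stays inhibited even after the camera moves.

```python
import math


def camera_to_world(point, rotation, translation):
    """Map a 3D point from the camera frame to the world frame: R * p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


class SpatialIOR:
    """Remembers attended 3D locations and inhibits candidates close to them."""

    def __init__(self, radius=0.15):
        self.radius = radius  # inhibition radius in metres (assumed value)
        self.attended = []    # world-frame positions that were already attended

    def is_inhibited(self, world_point):
        return any(math.dist(world_point, p) < self.radius for p in self.attended)

    def select(self, candidates, rotation, translation):
        """Return the most salient non-inhibited candidate and inhibit it.

        candidates: list of (saliency, centroid_in_camera_frame) pairs.
        Returns the chosen centroid in world coordinates, or None.
        """
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        for _saliency, centroid in ranked:
            world = camera_to_world(centroid, rotation, translation)
            if not self.is_inhibited(world):
                self.attended.append(world)
                return world
        return None


# Hypothetical two-frame example with an unrotated camera.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ior = SpatialIOR(radius=0.15)

# Frame 1: camera at the world origin, two object candidates.
first = ior.select(
    [(0.9, (0.0, 0.0, 1.0)), (0.5, (0.5, 0.0, 1.0))],
    IDENTITY, (0.0, 0.0, 0.0),
)

# Frame 2: camera moved 0.2 m along x, so both objects shift in the camera frame.
second = ior.select(
    [(0.9, (-0.2, 0.0, 1.0)), (0.5, (0.3, 0.0, 1.0))],
    IDENTITY, (0.2, 0.0, 0.0),
)

# Frame 3: same view again; both objects are now inhibited.
third = ior.select(
    [(0.9, (-0.2, 0.0, 1.0)), (0.5, (0.3, 0.0, 1.0))],
    IDENTITY, (0.2, 0.0, 0.0),
)
```

In this sketch the most salient candidate is attended in the first frame; in the second frame it is skipped despite the camera motion, because its world position matches a stored location, and the second object is selected instead. A real system would also need pose estimation and some handling of moving objects, neither of which is modelled here.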
Pages: 169-182
Number of pages: 13