A Deep Model of Visual Attention for Saliency Detection on 3D Objects

Cited by: 2
Authors
Rouhafzay, Ghazal [1 ]
Cretu, Ana-Maria [2 ]
Payeur, Pierre [1 ]
Affiliations
[1] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON, Canada
[2] Univ Quebec Outaouais, Dept Engn & Comp Sci, Gatineau, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Visual processing; Eye fixation; 3D shapes; Convolutional neural network; Class activation mapping;
DOI
10.1007/s11063-023-11180-w
CLC classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A variety of saliency detection techniques have been proposed over the last two decades to determine important regions on the surface of 3D shapes represented as triangular meshes. However, most fail to predict the regions where human eyes naturally fixate when observing and exploring an object. Taking inspiration from biological studies that enumerate object characteristics revealed by human visual processing, and from the influence of semantic properties on the emergence of neural responses in the human brain, in this work we propose a deep convolutional neural network architecture that uses gradient-based class activation mapping to detect saliencies on the surface of 3D objects while classifying them according to their different properties. We further argue that the Pearson correlation coefficient is not sufficient for evaluating saliency values and therefore propose a novel evaluation technique that determines how reliably saliency detectors predict eye fixations. More specifically, this evaluation metric measures the distance between the most salient region detected and the corresponding location of human eye fixation. Evaluating the results by visual comparison, as well as with the proposed evaluation technique, demonstrates that our model successfully predicts the locations where the human eye fixates. Results are compared with five state-of-the-art saliency detectors, and our experiments suggest that, on average, the location of the highest saliency detected by our approach is closer to the location of human eye fixation by about 22.55% to 77.76% compared with the five state-of-the-art methods on a public dataset.
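The core of the proposed evaluation, scoring a detector by how far its most salient point lies from the nearest recorded eye fixation rather than by correlating full saliency maps, can be sketched roughly as follows. This is a minimal illustration under assumed inputs (per-vertex saliency scores, mesh vertex coordinates, and fixation points in the same coordinate frame); the function names and the use of Euclidean rather than geodesic distance are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.stats import pearsonr

def fixation_distance_score(vertices, saliency, fixations):
    """Distance from the most salient vertex to the nearest eye fixation.

    vertices  : (N, 3) array of mesh vertex coordinates
    saliency  : (N,)   array of per-vertex saliency values
    fixations : (M, 3) array of recorded fixation points (same frame as vertices)

    Returns the Euclidean distance between the highest-saliency vertex and the
    closest fixation point (an illustrative stand-in for the paper's metric).
    """
    peak = vertices[np.argmax(saliency)]                 # most salient location
    return np.linalg.norm(fixations - peak, axis=1).min()

def pearson_score(saliency, fixation_density):
    """Classical map-level evaluation via Pearson correlation, for contrast."""
    r, _ = pearsonr(saliency, fixation_density)
    return r
```

Under these assumptions, a smaller fixation-distance score means the detector's peak lies closer to where observers actually looked, whereas the Pearson correlation compares whole saliency distributions and can stay high even when the single most salient point is misplaced.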
Pages: 8847-8867
Number of pages: 21