Visual Saliency via Selecting and Reweighting Features in Hierarchical Fusion Network

Cited by: 0
Authors
Zhou, Fei [1, 2, 3, 4, 5]
Chen, Junhua [1, 2, 3, 4, 5]
Liu, Bozhi [1, 2, 3, 4, 5]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen 518060, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518060, Peoples R China
[3] Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Peoples R China
[4] Shenzhen Univ, Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518060, Peoples R China
[5] Key Lab Digital Creat Technol, Shenzhen 518060, Peoples R China
Keywords
Feature extraction; Visualization; Predictive models; Computer architecture; Computational modeling; Task analysis; Biological system modeling; Feature selection and reweighting; hierarchical fusion network; saliency prediction; NEURAL-NETWORK; PREDICTION; ATTENTION; MODEL
DOI
10.1109/LSP.2021.3104757
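Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract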
Computational models based on deep neural networks have recently made impressive progress in predicting human visual saliency. By relying on powerful pre-trained networks, these models can extract diverse deep features. However, they do not address feature selection and reweighting when predicting saliency. As a result, features that describe scene distractors may also contribute to the saliency maps. In this paper, we propose a feature selection and reweighting module (FSRM) for deep saliency prediction models. The FSRM is intended to highlight saliency-related features in a manner similar to channel attention while excluding distractor features by reducing the number of channels in the deep features. Specifically, the FSRM computes an importance descriptor of the feature channels that encodes saliency knowledge, including the center prior and rarity. The number of feature channels is then reduced via a transformation matrix derived from the importance descriptor. To predict saliency, the FSRM is embedded in a hierarchical fusion network that makes use of multi-level features. Experiments and ablation studies show the effectiveness and generalization capability of the FSRM for saliency prediction.
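The select-and-reweight idea in the abstract can be illustrated with a minimal sketch. This is not the authors' FSRM: the abstract does not specify how the importance descriptor or the transformation matrix is computed, so the sketch assumes a sigmoid of each channel's spatial mean as the descriptor (the center prior and rarity are not modeled) and a top-k selection matrix whose rows are scaled by that descriptor. All function names and the `keep` parameter are hypothetical.

```python
import math


def channel_importance(features):
    """Assumed descriptor: sigmoid of each channel's global average."""
    scores = []
    for channel in features:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        scores.append(1.0 / (1.0 + math.exp(-mean)))
    return scores


def selection_matrix(importance, keep):
    """Build a keep x C matrix: one row per retained channel,
    holding that channel's importance and zeros elsewhere."""
    ranked = sorted(range(len(importance)),
                    key=lambda c: importance[c], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original channel order
    return [[importance[c] if c == k else 0.0
             for c in range(len(importance))]
            for k in kept]


def select_and_reweight(features, keep):
    """Apply the matrix along the channel axis at every spatial position,
    which drops low-importance channels and rescales the rest."""
    matrix = selection_matrix(channel_importance(features), keep)
    height, width = len(features[0]), len(features[0][0])
    return [
        [[sum(w * features[c][y][x] for c, w in enumerate(row))
          for x in range(width)]
         for y in range(height)]
        for row in matrix
    ]


# Three 2x2 channels; the strongly negative one is treated as a distractor.
feats = [
    [[2.0, 2.0], [2.0, 2.0]],
    [[-3.0, -3.0], [-3.0, -3.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
reduced = select_and_reweight(feats, keep=2)
```

With `keep=2`, the channel with the lowest assumed importance is discarded and the two remaining channels are scaled by their importance values, so reduction and reweighting happen in a single matrix product over the channel axis.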
Pages: 1749-1753
Page count: 5