Object semantic-guided graph attention feature fusion network for Siamese visual tracking

Cited by: 3
Authors
Zhang, Jianwei [1]
Miao, Mengen [1]
Zhang, Huanlong [2]
Wang, Jingchao [1]
Zhao, Yanchun [3]
Chen, Zhiwu [2]
Qiao, Jianwei [4]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Software Engn, Zhengzhou 450001, Peoples R China
[2] Zhengzhou Univ Light Ind, Coll Elect & Informat Engn, Zhengzhou 450002, Peoples R China
[3] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Huzhou, Huzhou 313001, Peoples R China
[4] Wolong Elect Nanyang Explos Proof Motor Grp, Nanyang 473000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Visual tracking; Siamese network; Semantic-guided; Graph attention; ROBUST
DOI
10.1016/j.jvcir.2022.103705
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Similarity matching between the template and the search area plays a key role in Siamese-based trackers. Most Siamese-based trackers use a correlation operation to fuse the features of the template branch and the search branch for similarity matching. However, the correlation operation slides the template feature directly over the search area feature as a window, without distinguishing the discriminative parts of the target from background noise, which blurs the spatial information of the response feature. To address this issue, this work proposes a novel object semantic-guided graph attention feature fusion network that both removes background information and focuses on the discriminative parts of the object. The proposed network removes background noise by using an adaptive template instead of the fixed-size template used by the correlation operation. The network also models the contextual semantic relations of the target and uses the resulting semantic relations to guide the feature fusion process in a part-based manner, thereby accurately highlighting the discriminative parts of the target. The blurring of the response feature caused by the correlation operation is thus resolved. Furthermore, we propose an object-aware prediction network that learns object-aware features for the classification and regression tasks, which improves the discriminative ability of the prediction network. Experiments on several challenging benchmarks, including OTB-100, LaSOT, TColor-128, GOT-10k and VOT2019, show that our method achieves excellent performance.
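For readers unfamiliar with the correlation operation that the abstract criticizes, the following is a minimal sketch of naive sliding-window cross-correlation on single-channel feature maps. It is not the authors' method or code: the function name `cross_correlate` and the toy values are hypothetical, and real trackers operate on multi-channel deep features. The point it illustrates is that every cell of the fixed-size template contributes to the score with the same standing, whether it covers the target or the background.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` and return the response map.

    Both arguments are 2-D lists of numbers (hypothetical single-channel
    feature maps). No padding is applied, so the response has shape
    (H - h + 1) x (W - w + 1).
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Every template cell is used, with no distinction between
            # target parts and background.
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


# Toy values, chosen only for illustration.
search = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
template = [
    [1, 2],
    [3, 4],
]
response = cross_correlate(search, template)
peak = max(max(row) for row in response)
```

Here the response peaks where the template aligns with the matching patch, but neighbouring positions also receive non-zero scores from partial overlap, which is a simple picture of the blurred response the abstract describes.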
Pages: 10