Single Object Tracking in Satellite Videos Based on Feature Enhancement and Multi-Level Matching Strategy

Cited by: 7
Authors
Yang, Jianwei [1 ,2 ,3 ]
Pan, Zongxu [1 ,2 ,3 ]
Liu, Yuhan [1 ,2 ]
Niu, Ben [1 ,2 ,3 ]
Lei, Bin [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Aerosp Informat Res Inst, Beijing 100094, Peoples R China
[2] Chinese Acad Sci, Key Lab Technol Geospatial Informat Proc & Applica, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 100049, Peoples R China
Keywords
satellite video; object tracking; siamese network; feature enhancement; matching strategy; SIAMESE NETWORKS;
DOI
10.3390/rs15174351
Chinese Library Classification number
X [Environmental Science, Safety Science];
Subject classification code
08 ; 0830 ;
Abstract
Despite significant advances in remote sensing object tracking (RSOT) in recent years, accurate and continuous tracking of tiny targets remains challenging because of interference from similar objects and related issues. In this paper, we approach the problem through feature enhancement and a better feature matching strategy, and present SiamTM, a tracker designed specifically for RSOT that is built mainly on a new target information enhancement (TIE) module and a multi-level matching strategy. First, we propose the TIE module to address the tiny object sizes found in satellite videos. The module operates along two spatial directions to capture orientation-aware and position-aware information, respectively, while capturing inter-channel information at the global 2D image level. It enables the network to extract discriminative target features from satellite images more effectively. Furthermore, we introduce a multi-level matching (MM) module that is better suited to satellite video targets. The MM module first embeds the target feature map, after ROI Align, into each position of the search region feature map to obtain a preliminary response map. The preliminary response map and the template region feature map then undergo a depth-wise cross-correlation operation to produce a more refined response map. Through this coarse-to-fine approach, the tracker obtains a response map with more accurate position information, which lays a good foundation for the predictions of the subsequent sub-networks. We conducted extensive experiments on two large satellite video single-object tracking datasets, SatSOT and SV248S. Without bells and whistles, SiamTM achieved competitive results on both datasets while running at real-time speed.
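The depth-wise cross-correlation step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the generic second (refinement) stage, in which each channel of one feature map is correlated with the matching channel of another, and it omits the ROI Align embedding stage, whose exact operation the abstract does not specify. The function names `depthwise_xcorr` and `peak_position` and the toy feature maps are assumptions made for illustration.

```python
def depthwise_xcorr(search, kernel):
    """Correlate each channel of `search` with the same channel of `kernel`.

    search: list of C channels, each an H x W list of lists
    kernel: list of C channels, each a kh x kw list of lists
    returns: list of C response maps, each (H-kh+1) x (W-kw+1)
    """
    if len(search) != len(kernel):
        raise ValueError("search and kernel must have the same channel count")
    response = []
    for s_ch, k_ch in zip(search, kernel):
        h, w = len(s_ch), len(s_ch[0])
        kh, kw = len(k_ch), len(k_ch[0])
        ch_map = []
        for i in range(h - kh + 1):
            row = []
            for j in range(w - kw + 1):
                # Sum of element-wise products over the kernel window.
                acc = 0.0
                for di in range(kh):
                    for dj in range(kw):
                        acc += s_ch[i + di][j + dj] * k_ch[di][dj]
                row.append(acc)
            ch_map.append(row)
        response.append(ch_map)
    return response


def peak_position(ch_map):
    """Return (row, col) of the largest value in one response map."""
    best = (0, 0)
    for i, row in enumerate(ch_map):
        for j, value in enumerate(row):
            if value > ch_map[best[0]][best[1]]:
                best = (i, j)
    return best


# Toy data (hypothetical): one channel holding a 2x2 blob, one constant channel.
search_feat = [
    [[0, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 0]],
    [[1, 1, 1, 1] for _ in range(4)],
]
template_feat = [
    [[1, 1], [1, 1]],
    [[0.5, 0.5], [0.5, 0.5]],
]

resp = depthwise_xcorr(search_feat, template_feat)
print(peak_position(resp[0]))  # location where channel 0 best matches the template
```

Because the channels are never mixed, the output keeps one response map per channel, which is the property that lets later sub-networks weigh channels separately.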
Pages: 25
Related papers
50 in total
  • [21] Salient Object Detection Based on Multi-scale Feature Extraction and Multi-level Feature Fusion
    Li, Lingli
    Meng, Lingbing
    Li, Jinbao
    Gongcheng Kexue Yu Jishu/Advanced Engineering Sciences, 2021, 53 (01): 170 - 177
  • [22] Feature refinement with multi-level context for object detection
    Yingdong Ma
    Yanan Wang
    Machine Vision and Applications, 2023, 34
  • [23] Feature refinement with multi-level context for object detection
    Ma, Yingdong
    Wang, Yanan
    MACHINE VISION AND APPLICATIONS, 2023, 34 (04)
  • [24] Multi-level Feature Selection for Oriented Object Detection
    Jiang, Chen
    Jiang, Yefan
    Bian, Zhangxing
    Yang, Fan
    Xia, Siyu
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS (ICPRAM), 2021: 36 - 43
  • [25] Object Tracking in Satellite Videos Based on Improved Correlation Filters
    Liu Yaosheng
    Liao Yurong
    Lin Cunbao
    Li Zhaoming
    Yang Xinyan
    Zhang Aidi
    2021 13TH INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN 2021), 2021: 323 - 331
  • [26] A single-shot multi-level feature reused neural network for object detection
    Lixin Wei
    Wei Cui
    Ziyu Hu
    Hao Sun
    Shijie Hou
    The Visual Computer, 2021, 37 : 133 - 142
  • [27] A single-shot multi-level feature reused neural network for object detection
    Wei, Lixin
    Cui, Wei
    Hu, Ziyu
    Sun, Hao
    Hou, Shijie
    VISUAL COMPUTER, 2021, 37 (01): 133 - 142
  • [28] Video object segmentation based on multi-level target models and feature integration
    Gao, Bocong
    Zhao, Yuqian
    Zhang, Fan
    Luo, Biao
    Yang, Chunhua
    Neurocomputing, 2022, 492 : 396 - 407
  • [29] Video object segmentation based on multi-level target models and feature integration
    Gao, Bocong
    Zhao, Yuqian
    Zhang, Fan
    Luo, Biao
    Yang, Chunhua
    NEUROCOMPUTING, 2022, 492 : 396 - 407
  • [30] Determining Wave Height from Nearshore Videos Based on Multi-level Spatiotemporal Feature Fusion
    Song, Wei
    Li, Qi-chao
    He, Qi
    Zhou, Xu
    Chen, Yuan-yuan
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021