A Multi-Scale Learnable Feature Alignment Network for Video Object Detection

Authors
Wang, Rui [1]
Affiliations
[1] Beijing Univ Technol, Comp Coll, Beijing, Peoples R China
Keywords
Object detection; Deep convolutional neural network (DCNN); Feature propagation; Feature fusion
DOI
10.1109/MASS62177.2024.00078
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Object detection is a core computer vision task: locating instances of visual objects of given classes in digital images. Video object detection aims to locate one or more objects across sequential frames and assign each a category label. Because the two tasks are similar, image object detection methods are often applied to video. However, motion blur, occlusion, morphological diversity, and illumination changes make video object detection more demanding. Within the framework of video object detection based on feature reuse and recursive fusion, we propose a multi-scale learnable feature alignment (MLFA) network. MLFA divides video frames into key frames and non-key frames and propagates a memory feature, which carries information from past key frames, along the time dimension; this memory feature compensates the current frame's feature through feature fusion. During alignment, a feature pyramid is built first, and the alignment features at each level are then learned. Features from the different levels are subsequently fused to exploit multi-scale information. MLFA maintains efficiency while further improving detection accuracy.
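The data flow described in the abstract (key/non-key frames, a propagated memory feature, per-level alignment on a pyramid, then fusion) can be illustrated with a toy sketch. This is not the authors' implementation: features are plain 1-D lists instead of CNN tensors, the fixed integer `offsets` merely stand in for the alignment that MLFA learns, and all function names, `key_interval`, and `weight` are hypothetical.

```python
from typing import List, Optional

Feature = List[float]  # toy 1-D feature map standing in for a CNN feature tensor


def downsample(feat: Feature) -> Feature:
    """One pyramid step: halve resolution by averaging adjacent pairs."""
    return [(feat[i] + feat[i + 1]) / 2 for i in range(0, len(feat) - 1, 2)]


def upsample(feat: Feature, size: int) -> Feature:
    """Nearest-neighbour upsampling back to `size` elements."""
    return [feat[min(i * len(feat) // size, len(feat) - 1)] for i in range(size)]


def shift(feat: Feature, offset: int) -> Feature:
    """Move content `offset` positions to the right, clamping at the borders.
    Assumption: a fixed shift replaces the learned alignment."""
    n = len(feat)
    return [feat[min(max(i - offset, 0), n - 1)] for i in range(n)]


def align_multiscale(memory: Feature, offsets: List[int]) -> Feature:
    """Align the memory feature at each pyramid level, then average the levels."""
    size = len(memory)
    level = memory
    aligned_levels = []
    for k, off in enumerate(offsets):
        aligned_levels.append(upsample(shift(level, off), size))
        if k < len(offsets) - 1:
            level = downsample(level)  # move to the next, coarser level
    return [sum(vals) / len(vals) for vals in zip(*aligned_levels)]


def fuse(aligned: Feature, current: Feature, weight: float) -> Feature:
    """Weighted fusion of the aligned memory with the current frame feature."""
    return [weight * a + (1 - weight) * c for a, c in zip(aligned, current)]


def process_video(frames: List[Feature], key_interval: int,
                  offsets: List[int], weight: float = 0.5) -> List[Feature]:
    """Propagate a memory feature through time, refreshing it on key frames."""
    memory: Optional[Feature] = None
    outputs = []
    for t, feat in enumerate(frames):
        if memory is None:
            out = feat
            memory = out  # the first frame initialises the memory
        else:
            out = fuse(align_multiscale(memory, offsets), feat, weight)
            if t % key_interval == 0:
                memory = out  # recursive update on key frames only
        outputs.append(out)
    return outputs
```

For example, `process_video([[1.0] * 8, [3.0] * 8, [5.0] * 8], key_interval=2, offsets=[0, 0])` fuses each later frame with the memory from the most recent key frame; a real system would replace `shift` with learned sampling and `fuse` with a learned fusion module.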
Pages: 496-501
Page count: 6