A Multi-Scale Learnable Feature Alignment Network for Video Object Detection

Authors
Wang, Rui [1 ]
Affiliation
[1] Beijing Univ Technol, Comp Coll, Beijing, Peoples R China
Keywords
Object detection; Deep convolutional neural network (DCNN); Feature propagation; Feature fusion;
DOI
10.1109/MASS62177.2024.00078
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection is a core computer vision task that locates instances of visual objects of given classes in digital images. Video object detection aims to locate one or more objects in a sequence of frames and assign each a category label. Because the two tasks are similar, image object detection methods are often applied to video. However, motion blur, occlusion, morphological diversity, and illumination changes make video object detection more demanding. Within a video object detection framework based on feature reuse and recursive fusion, we propose a multi-scale learnable sampling alignment (MLFA) network. MLFA divides video frames into key frames and non-key frames and propagates a memory feature, which carries information from past key frames, along the time dimension; this memory feature compensates the current frame's feature through feature fusion. During alignment, a feature pyramid is built first, and the alignment features at each level are then learned. Features from the different levels are subsequently fused to exploit multi-scale information. MLFA maintains efficiency while further improving detection accuracy.
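The pipeline described in the abstract (key/non-key frame split, memory propagation, pyramid alignment, multi-scale fusion) can be illustrated with a toy sketch. This is not the authors' implementation: features are plain 1-D lists instead of CNN feature maps, the fixed per-level integer `offsets` stand in for the learned sampling, and all function names, the key-frame interval, and the fusion weight are hypothetical choices made for illustration only.

```python
from typing import List, Optional

Feature = List[float]


def is_key_frame(index: int, interval: int = 4) -> bool:
    """Assumption: every `interval`-th frame is treated as a key frame."""
    return index % interval == 0


def build_pyramid(feature: Feature, levels: int) -> List[Feature]:
    """Toy feature pyramid: each level average-pools the previous one by 2."""
    pyramid = [list(feature)]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        pyramid.append([(prev[i] + prev[i + 1]) / 2
                        for i in range(0, len(prev) - 1, 2)])
    return pyramid


def shift(feature: Feature, offset: int) -> Feature:
    """Sample position i from i - offset, clamping at the borders.
    Stand-in for learned sampling; a real model would predict the offsets."""
    last = len(feature) - 1
    return [feature[min(max(i - offset, 0), last)]
            for i in range(len(feature))]


def upsample(feature: Feature, size: int) -> Feature:
    """Nearest-neighbour upsampling back to the finest resolution."""
    factor = size // len(feature)
    return [feature[min(i // factor, len(feature) - 1)] for i in range(size)]


def align_memory(memory: Feature, offsets: List[int]) -> Feature:
    """Align the memory at every pyramid level, then average the levels."""
    pyramid = build_pyramid(memory, levels=len(offsets))
    aligned = [upsample(shift(level, off), len(memory))
               for level, off in zip(pyramid, offsets)]
    return [sum(values) / len(values) for values in zip(*aligned)]


def fuse(memory: Feature, current: Feature, weight: float = 0.5) -> Feature:
    """Weighted blend of the aligned memory with the current frame feature."""
    return [weight * m + (1 - weight) * c for m, c in zip(memory, current)]


def run(frames: List[Feature], offsets: List[int],
        interval: int = 4) -> List[Feature]:
    """Propagate a memory feature through time; only key frames update it."""
    memory: Optional[Feature] = None
    outputs: List[Feature] = []
    for index, feature in enumerate(frames):
        if memory is None:
            memory = list(feature)       # first frame initialises the memory
            outputs.append(list(memory))
            continue
        fused = fuse(align_memory(memory, offsets), feature)
        if is_key_frame(index, interval):
            memory = fused               # recursive fusion on key frames
        outputs.append(fused)
    return outputs
```

In this sketch the feature length must be divisible by 2 raised to the number of pyramid levels minus one, so that each pooled level upsamples cleanly. Calling `run(frames, offsets=[0, 0, 0])` on a list of length-8 features yields one compensated feature per frame, with the memory refreshed only at key-frame indices.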
Pages: 496-501
Page count: 6