A Multi-Scale Learnable Feature Alignment Network for Video Object Detection

Cited by: 0

Authors:

Wang, Rui [1]

Affiliations:

[1] Beijing Univ Technol, Comp Coll, Beijing, Peoples R China

Source:

2024 IEEE 21ST INTERNATIONAL CONFERENCE ON MOBILE AD-HOC AND SMART SYSTEMS, MASS 2024 | 2024

Keywords:

Object detection; Deep convolutional neural network (DCNN); Feature propagation; Feature fusion

DOI:

10.1109/MASS62177.2024.00078

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Object detection is an important computer vision task that detects instances of visual objects of a given class in digital images. Video object detection aims to locate one or more objects in a sequence of images and assign a category label to each. Because video object detection and image object detection are similar, image object detection methods are often applied to video. However, motion blur, occlusion, morphological diversity, and illumination changes make video object detection more demanding. Within a video object detection framework based on feature reuse and recursive fusion, we propose a multi-scale learnable feature alignment (MLFA) network. MLFA divides video frames into key frames and non-key frames and propagates a memory feature containing historical key-frame information along the time dimension, compensating the current frame's feature through feature fusion. During alignment, a feature pyramid is built first, and the alignment features at each level are then learned. Features from the different levels are then fused to leverage multi-scale information. MLFA maintains efficiency while further improving detection accuracy.
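The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the learnable alignment is replaced by an identity placeholder, the fusion is a fixed weighted average, and all names (`build_pyramid`, `upsample_to`, `propagate_memory`, `key_interval`, `weight`) are hypothetical. It only shows how a memory feature updated at key frames could be fused with the current frame at several pyramid levels.

```python
import numpy as np

def build_pyramid(feat, levels=3):
    """Build a feature pyramid by 2x2 average pooling.
    feat has shape (C, H, W); H and W must be divisible by 2**(levels - 1)."""
    pyramid = [feat]
    for _ in range(levels - 1):
        c, h, w = pyramid[-1].shape
        pooled = pyramid[-1].reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
        pyramid.append(pooled)
    return pyramid

def upsample_to(feat, h, w):
    """Nearest-neighbour upsampling back to (C, h, w) by an integer factor."""
    _, fh, fw = feat.shape
    return feat.repeat(h // fh, axis=1).repeat(w // fw, axis=2)

def propagate_memory(frame_feats, key_interval=4, levels=3, weight=0.5):
    """Fuse each frame's feature with a memory feature carried from key frames.

    ASSUMPTION: alignment is the identity here; in the paper it is learned
    per pyramid level. The fixed `weight` stands in for a learned fusion.
    """
    memory = None
    outputs = []
    for t, feat in enumerate(frame_feats):
        if memory is None:
            fused = feat  # first frame: nothing to compensate with yet
        else:
            _, h, w = feat.shape
            mem_pyr = build_pyramid(memory, levels)
            cur_pyr = build_pyramid(feat, levels)
            per_level = [
                upsample_to(weight * m + (1.0 - weight) * f, h, w)
                for m, f in zip(mem_pyr, cur_pyr)
            ]
            fused = np.mean(per_level, axis=0)  # merge multi-scale results
        if t % key_interval == 0:
            memory = fused  # only key frames update the memory (recursive fusion)
        outputs.append(fused)
    return outputs
```

For example, calling `propagate_memory` on a list of `(C, H, W)` arrays returns one fused feature per frame with the same shape; non-key frames reuse the memory without updating it, which is where the efficiency of the feature-reuse framework would come from.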

Pages: 496-501

Number of pages: 6

