Multi-Scale Feature Fusion and Context-Enhanced Spatial Sparse Convolution Single-Shot Detector for Unmanned Aerial Vehicle Image Object Detection

Cited by: 0

Authors:

Qi, Guimei ^{[1,2]}

Yu, Zhihong ^{[2]}

Song, Jian ^{[2]}

Affiliations:

[1] Inner Mongolia Normal Univ, Coll Comp Sci & Technol, Hohhot 010022, Peoples R China

[2] Inner Mongolia Agr Univ, Coll Mech & Elect Engn, Hohhot 010010, Peoples R China

Source:

APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

UAV image object detection; SSD; multi-scale feature fusion; context-enhanced spatial sparse convolution;

DOI:

10.3390/app15020924

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Accurate and efficient object detection in UAV images is challenging because target scales vary widely and small targets are abundant. This study investigates enhancing the detection head with sparse convolution and demonstrates its effectiveness in balancing accuracy and efficiency. However, sparse convolution incorporates global contextual information inadequately, and its fixed mask ratios make the network inflexible. To address these issues, this paper proposes MFFCESSC-SSD, a novel single-shot detector (SSD) with multi-scale feature fusion and context-enhanced spatial sparse convolution. First, a global context-enhanced group normalization (CE-GN) layer is developed to address the information loss that results from applying convolution exclusively to the masked region. Second, a dynamic masking strategy is designed to determine the optimal mask ratios, ensuring compact foreground coverage that enhances both accuracy and efficiency. Experiments on two datasets (VisDrone and ARH2000, the latter created by the authors) demonstrate that MFFCESSC-SSD markedly outperforms the SSD and numerous conventional object detection algorithms in accuracy and efficiency.
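
The two ideas named in the abstract, computing only on a masked foreground region and normalizing that sparse output with statistics taken from the whole feature map, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed score threshold, the 3x3 mean filter standing in for a learned convolution, and the single-channel normalization standing in for group normalization are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def dynamic_mask(scores, threshold=0.5):
    """Mark pixels whose (hypothetical) foreground score exceeds the threshold.

    Returns the boolean mask and the resulting mask ratio, so the ratio
    follows the image content instead of being fixed in advance.
    """
    mask = [[s > threshold for s in row] for row in scores]
    total = sum(len(row) for row in scores)
    ratio = sum(sum(row) for row in mask) / total
    return mask, ratio

def sparse_box_filter(feature, mask):
    """Evaluate a 3x3 mean filter (zero padding) only at masked pixels.

    Unmasked positions are skipped and left at 0.0, which is where the
    computational saving of a sparse convolution comes from.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += feature[yy][xx]
            out[y][x] = acc / 9.0
    return out

def context_normalize(sparse_out, mask, dense_feature, eps=1e-5):
    """Normalize masked outputs with mean/std of the full (dense) feature map.

    Using dense statistics is a simplified stand-in for injecting global
    context that the masked region alone does not carry.
    """
    flat = [v for row in dense_feature for v in row]
    mu, sigma = mean(flat), pstdev(flat)
    return [
        [(v - mu) / (sigma + eps) if m else 0.0 for v, m in zip(row, mrow)]
        for row, mrow in zip(sparse_out, mask)
    ]

# Toy 4x4 inputs: the four central pixels are treated as foreground.
scores = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.6, 0.0],
    [0.0, 0.1, 0.2, 0.1],
]
feature = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]

mask, ratio = dynamic_mask(scores)
sparse = sparse_box_filter(feature, mask)
normalized = context_normalize(sparse, mask, feature)
print("mask ratio:", ratio)
```

In this sketch the mask ratio is a by-product of thresholding a score map; how the paper actually selects its mask ratios is not described in the abstract.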


Pages: 13

