MLSA-YOLO: a multi-level feature fusion and scale-adaptive framework for small object detection

Cited by: 0

Authors:

Peng, Jiayu ^{[1]}

Lv, Kai ^{[2]}

Wang, Guoliang ^{[2]}

Xiao, Wendong ^{[2]}

Ran, Teng ^{[2]}

Yuan, Liang ^{[2]}

Affiliations:

[1] Xinjiang Univ, Sch Software, Urumqi 830091, Peoples R China

[2] Xinjiang Univ, Sch Mech Engn, Urumqi 830017, Peoples R China

Source:

JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

YOLOv8; Small object detection; Multi-level feature fusion; Scale-adaptive;

DOI:

10.1007/s11227-025-06961-0

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Because small objects occupy only a small image area, feature extraction paradigms that are not well suited to them can further exacerbate the loss of their already limited information. Additionally, inconsistencies between features at different levels in FPN can result in suboptimal feature fusion, hindering the accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects effectively. To resolve these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructured the network architecture using SPD-Conv with the proposed Convolutional Space-to-Depth (CSPD) module to improve the network's capacity for capturing local spatial details in images and to ensure that information is preserved during the downsampling process. Furthermore, to address the challenges in feature fusion, we employed a three-layer PAFPN structure at the neck and combined it with the proposed Multi-Level Feature Fusion and Scale-Adaptive (MLSA) feature pyramid network. This method enhances the complementarity of multi-level information while effectively filtering the conflicting information generated during the fusion phase. To improve the quality of feature extraction, we incorporated the designed DCN_C2f module into the neck network. This module can accurately capture foreground object features while enhancing the network's adaptability to geometric deformations of objects. Experimental results show that our approach performs better than other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets. Compared to YOLOv8s, mAP50 improved by 9.5%, 3.4%, and 5.1% on these datasets, respectively.
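
The abstract's claim that information is preserved during downsampling can be illustrated with a generic space-to-depth rearrangement, the operation that SPD-Conv is generally described as applying before a non-strided convolution. The sketch below is an assumption-laden illustration in NumPy, not the authors' CSPD module: the function name `space_to_depth`, the `(C, H, W)` layout, and the channel ordering are all hypothetical choices, and the convolution that would follow is omitted.

```python
import numpy as np


def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Rearrange (C, H, W) into (C * block**2, H // block, W // block).

    Every input value is kept: spatial resolution is traded for channels,
    whereas strided convolution or pooling would discard values.
    (Illustrative helper; name and layout are assumptions.)
    """
    c, h, w = x.shape
    if h % block or w % block:
        raise ValueError("H and W must be divisible by block")
    # Split each spatial axis into (coarse index, offset inside the block).
    x = x.reshape(c, h // block, block, w // block, block)
    # Bring the two in-block offsets to the front, ahead of the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    # Merge (row offset, column offset, channel) into one channel axis.
    return x.reshape(c * block * block, h // block, w // block)


# A 1-channel 4x4 map becomes a 4-channel 2x2 map with the same 16 values.
feature_map = np.arange(16).reshape(1, 4, 4)
downsampled = space_to_depth(feature_map)
```

Each output channel here is one of the four interleaved sub-grids of the input, so the halved resolution costs no pixel values; this is the sense in which such a downsampling step can retain the fine detail that small objects depend on.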

Pages: 24
