MLSA-YOLO: a multi-level feature fusion and scale-adaptive framework for small object detection

Authors
Peng, Jiayu [1 ]
Lv, Kai [2 ]
Wang, Guoliang [2 ]
Xiao, Wendong [2 ]
Ran, Teng [2 ]
Yuan, Liang [2 ]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi 830091, Peoples R China
[2] Xinjiang Univ, Sch Mech Engn, Urumqi 830017, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
YOLOv8; Small object detection; Multi-level feature fusion; Scale-adaptive;
DOI
10.1007/s11227-025-06961-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Because small objects occupy a limited target area, feature extraction paradigms that are not well suited to them can further exacerbate the loss of their already limited information. Additionally, inconsistencies between features at different levels in FPN can result in suboptimal feature fusion, hindering the accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects effectively. To resolve these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructured the network architecture using SPD-Conv with the proposed Convolutional Space-to-Depth (CSPD) module to improve the network's capacity for capturing local spatial details in images and to ensure that information is preserved during the downsampling process. Furthermore, to address the challenges in feature fusion, we employed a three-layer PAFPN structure at the neck and combined it with the proposed Multi-Level Feature Fusion and Scale-Adaptive (MLSA) feature pyramid network. This method enhances the complementarity of multi-level information while effectively filtering the conflicting information generated during the fusion phase. To improve the quality of feature extraction, we incorporated the designed DCN_C2f module into the neck network. This module can accurately capture foreground object features while enhancing the network's adaptability to geometric deformations of objects. Experimental results show that our approach performs better than other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets. Compared to YOLOv8s, mAP50 improved by 9.5%, 3.4%, and 5.1% on these three datasets, respectively.
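The abstract does not describe the internals of SPD-Conv or the CSPD module. As a rough illustration of the general space-to-depth idea that those names refer to (not the authors' implementation), the sketch below rearranges each 2x2 spatial block of a feature map into extra channels, so that resolution is halved without discarding any values. The function name `space_to_depth`, the nested-list (channels, height, width) layout, and the output channel ordering are assumptions made for this example.

```python
def space_to_depth(x, block=2):
    """Rearrange a (C, H, W) nested list into (C * block**2, H // block, W // block).

    Hypothetical helper for illustration only; every input value is kept and
    simply moved from a spatial position into a new channel.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    # One output channel per (row offset, column offset, input channel) triple.
    return [
        [
            [x[ch][i * block + di][j * block + dj] for j in range(w // block)]
            for i in range(h // block)
        ]
        for di in range(block)
        for dj in range(block)
        for ch in range(c)
    ]


# Toy 2-channel 4x4 feature map with distinct values 0..31.
feature_map = [
    [[ch * 16 + r * 4 + col for col in range(4)] for r in range(4)]
    for ch in range(2)
]
downsampled = space_to_depth(feature_map, block=2)
```

In a real detector this rearrangement would normally be done with a tensor library and followed by a non-strided convolution; that pairing is a common reading of "SPD-Conv" and is stated here as an assumption, not as a detail confirmed by this record.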
Pages: 24