MLSA-YOLO: a multi-level feature fusion and scale-adaptive framework for small object detection

Cited by: 0
Authors
Peng, Jiayu [1]
Lv, Kai [2]
Wang, Guoliang [2]
Xiao, Wendong [2]
Ran, Teng [2]
Yuan, Liang [2]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi 830091, Peoples R China
[2] Xinjiang Univ, Sch Mech Engn, Urumqi 830017, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 04
Funding
National Natural Science Foundation of China
Keywords
YOLOv8; Small object detection; Multi-level feature fusion; Scale-adaptive
DOI
10.1007/s11227-025-06961-0
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
0812
Abstract
Because small objects occupy a limited target area, feature extraction paradigms that are poorly suited to them can further exacerbate the loss of their already limited information. In addition, inconsistencies between features at different levels of the FPN can lead to suboptimal feature fusion, hindering the accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects effectively. To address these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructure the network architecture using SPD-Conv together with the proposed Convolutional Space-to-Depth (CSPD) module, which improves the network's capacity to capture local spatial details in images and ensures that information is preserved during downsampling. Second, to address the challenges in feature fusion, we employ a three-layer PAFPN structure in the neck and combine it with the proposed Multi-Level Feature Fusion and Scale-Adaptive (MLSA) feature pyramid network. This design enhances the complementarity of multi-level information while effectively filtering the conflicting information generated during the fusion phase. Third, to improve the quality of feature extraction, we incorporate the designed DCN_C2f module into the neck network. This module accurately captures foreground object features while enhancing the network's adaptability to geometric deformations of objects. Experimental results show that our approach outperforms other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets. Compared with YOLOv8s, mAP50 improves by 9.5%, 3.4%, and 5.1%, respectively.
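The abstract's point about preserving information during downsampling can be illustrated with a plain space-to-depth rearrangement, the operation that SPD-Conv is generally understood to apply before a non-strided convolution. The following is a minimal NumPy sketch under that assumption, not the paper's CSPD module (whose internals are not described in this record). The function names `space_to_depth` and `depth_to_space`, the `(C, H, W)` layout, and the scale factor of 2 are illustrative choices.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) array into (C * scale**2, H // scale, W // scale).

    Every input value is moved into the channel dimension, so spatial
    resolution drops without discarding any pixel (unlike strided
    convolution or pooling).
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # Split each spatial axis into (block index, offset within block).
    x = x.reshape(c, h // scale, scale, w // scale, scale)
    # Move the two in-block offsets in front of the channel axis.
    x = x.transpose(2, 4, 0, 1, 3)
    return x.reshape(scale * scale * c, h // scale, w // scale)


def depth_to_space(y: np.ndarray, scale: int = 2) -> np.ndarray:
    """Inverse of space_to_depth; used here only to show losslessness."""
    cs, h, w = y.shape
    c = cs // (scale * scale)
    y = y.reshape(scale, scale, c, h, w)
    # Put each in-block offset back next to its spatial block index.
    y = y.transpose(2, 3, 0, 4, 1)
    return y.reshape(c, h * scale, w * scale)


# Hypothetical 2-channel, 4x4 feature map for illustration.
feature_map = np.arange(2 * 4 * 4).reshape(2, 4, 4)
downsampled = space_to_depth(feature_map)
print(downsampled.shape)  # (8, 2, 2)
```

Because `depth_to_space(space_to_depth(x))` reproduces `x` exactly, the halved resolution costs no pixel values; a subsequent stride-1 convolution (omitted here) would then mix the enlarged channel dimension.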
Pages: 24