MLSA-YOLO: a multi-level feature fusion and scale-adaptive framework for small object detection

Cited by: 0
Authors
Peng, Jiayu [1]
Lv, Kai [2]
Wang, Guoliang [2]
Xiao, Wendong [2]
Ran, Teng [2]
Yuan, Liang [2]
Affiliations
[1] Xinjiang Univ, Sch Software, Urumqi 830091, Peoples R China
[2] Xinjiang Univ, Sch Mech Engn, Urumqi 830017, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 04
Funding
National Natural Science Foundation of China;
Keywords
YOLOv8; Small object detection; Multi-level feature fusion; Scale-adaptive;
DOI
10.1007/s11227-025-06961-0
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Small objects occupy a limited target area, so feature extraction paradigms that are poorly suited to them can further exacerbate the loss of their already limited information. In addition, inconsistencies between features at different levels of the FPN can lead to suboptimal feature fusion, hindering accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects effectively. To address these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructure the network architecture using SPD-Conv together with the proposed Convolutional Space-to-Depth (CSPD) module to improve the network's capacity for capturing local spatial details in images and to ensure that information is preserved during downsampling. Second, to address the challenges in feature fusion, we employ a three-layer PAFPN structure at the neck and combine it with the proposed Multi-Level Feature Fusion and Scale-Adaptive (MLSA) feature pyramid network. This method enhances the complementarity of multi-level information while effectively filtering the conflicting information generated during the fusion phase. Third, to improve the quality of feature extraction, we incorporate the designed DCN_C2f module into the neck network. This module accurately captures foreground object features while enhancing the network's adaptability to geometric deformations of objects. Experimental results show that our approach outperforms other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets. Compared with YOLOv8s, mAP50 improves by 9.5%, 3.4%, and 5.1%, respectively.
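The space-to-depth rearrangement that the abstract refers to can be illustrated with a minimal sketch. This is not the authors' CSPD module or the SPD-Conv implementation; the function name `space_to_depth`, the `scale` parameter, and the channel-first nested-list layout are assumptions made only for illustration. The point it shows is that downsampling by rearrangement moves spatial positions into channels without discarding any values.

```python
def space_to_depth(feature, scale=2):
    """Rearrange a C x H x W nested list into (C*scale*scale) x (H/scale) x (W/scale).

    Hypothetical helper for illustration: spatial resolution shrinks,
    the channel count grows, and every input value is kept.
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    out = []
    # One sub-map per (row offset, column offset) pair, for each input channel.
    for dy in range(scale):
        for dx in range(scale):
            for c in range(channels):
                sub_map = [
                    [feature[c][y][x] for x in range(dx, width, scale)]
                    for y in range(dy, height, scale)
                ]
                out.append(sub_map)
    return out


# A single-channel 4 x 4 map holding the values 0..15.
fmap = [[[r * 4 + col for col in range(4)] for r in range(4)]]
packed = space_to_depth(fmap, scale=2)
print(len(packed), len(packed[0]), len(packed[0][0]))  # 4 2 2
```

In contrast to a strided convolution or pooling step, which skips or merges positions, the 16 input values here all survive as four 2 x 2 sub-maps; a subsequent non-strided convolution could then mix them.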
Pages: 24
Related papers
50 in total
  • [31] PS-YOLO: a small object detector based on efficient convolution and multi-scale feature fusion
    Peng, Shifeng
    Fan, Xin
    Tian, Shengwei
    Yu, Long
    MULTIMEDIA SYSTEMS, 2024, 30 (05)
  • [32] Small Object Detection via Scale-Adaptive Label Assignment and Localization Uncertainty
    Qin, Hui
    Mei, Tiancan
    Wang, Yaru
    UNMANNED SYSTEMS, 2024,
  • [33] Attention-Based Multi-Level Feature Fusion for Object Detection in Remote Sensing Images
    Dong, Xiaohu
    Qin, Yao
    Gao, Yinghui
    Fu, Ruigang
    Liu, Songlin
    Ye, Yuanxin
    REMOTE SENSING, 2022, 14 (15)
  • [34] REVISITING MULTI-LEVEL FEATURE FUSION: A SIMPLE YET EFFECTIVE NETWORK FOR SALIENT OBJECT DETECTION
    Qiu, Yu
    Liu, Yun
    Ma, Xiaoxu
    Liu, Lei
    Gao, Hongcan
    Xu, Jing
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 4010 - 4014
  • [35] Small Object Detection Based on Bidirectional Feature Fusion and Multi-scale Distillation
    Wang, Lingyu
    Zhou, Zijie
    Shi, Guanqun
    Guo, Junkang
    Liu, Zhigang
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT II, 2024, 15017 : 200 - 214
  • [36] MsfNet: a novel small object detection based on multi-scale feature fusion
    Song, Ziying
    Wu, Peiliang
    Yang, Kuihe
    Zhang, Yu
    Liu, Yi
    2021 17TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING (MSN 2021), 2021, : 700 - 704
  • [37] MLFA: Toward Realistic Test Time Adaptive Object Detection by Multi-Level Feature Alignment
    Liu, Yabo
    Wang, Jinghua
    Huang, Chao
    Wu, Yiling
    Xu, Yong
    Cao, Xiaochun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5837 - 5848
  • [38] Multi-level temporal feature fusion with feature exchange strategy for multiple object tracking
    Ge, Yisu
    Ye, Wenjie
    Zhang, Guodao
    Lin, Mengying
    OPTOELECTRONICS LETTERS, 2024, 20 (08) : 505 - 512
  • [39] Multi-level temporal feature fusion with feature exchange strategy for multiple object tracking
    GE Yisu
    YE Wenjie
    ZHANG Guodao
    LIN Mengying
    Optoelectronics Letters, 2024, 20 (08) : 505 - 512
  • [40] Combining Semantics With Multi-level Feature Fusion for Pedestrian Detection
    Chu J.
    Shu W.
    Zhou Z.-B.
    Miao J.
    Leng L.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (01) : 282 - 291