Due to the limited image area occupied by small objects, feature extraction paradigms that are not well-suited to them can further exacerbate the loss of their already scarce information. Additionally, inconsistencies between features at different levels of an FPN can result in suboptimal feature fusion, hindering the accurate representation of multi-scale features. As a result, even high-performance detectors struggle to recognize small objects. To address these issues, we propose MLSA-YOLO, a small object detection algorithm based on multi-level feature fusion and scale adaptation. First, we restructured the network using SPD-Conv together with the proposed Convolutional Space-to-Depth (CSPD) module, improving the network's capacity to capture local spatial details and ensuring that information is preserved during downsampling. Furthermore, to address the challenges of feature fusion, we employed a three-layer PAFPN structure in the neck and combined it with the proposed Multi-Level feature fusion and Scale-Adaptive (MLSA) feature pyramid network. This design enhances the complementarity of multi-level information while effectively filtering out the conflicting information generated during fusion. To further improve feature quality, we incorporated the designed DCN_C2f module into the neck network; it accurately captures foreground object features while enhancing the network's adaptability to geometric deformations. Experimental results show that our approach outperforms other state-of-the-art detection algorithms on the VisDrone2019, DOTA, and FocusTiny datasets, improving mAP50 over YOLOv8s by 9.5%, 3.4%, and 5.1%, respectively.
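The internals of the proposed CSPD module are not given here, but the space-to-depth rearrangement underlying SPD-Conv is standard: each stride-`b` spatial block is folded into the channel dimension, so resolution drops without discarding pixels (a strided convolution, by contrast, skips them). A minimal NumPy sketch of that rearrangement, with the function name and layout chosen for illustration:

```python
import numpy as np

def space_to_depth(x: np.ndarray, block: int = 2) -> np.ndarray:
    """Rearrange (C, H, W) -> (C*block*block, H/block, W/block).

    Every pixel is kept: each block x block spatial patch is moved
    into the channel axis, so downsampling loses no information.
    """
    c, h, w = x.shape
    assert h % block == 0 and w % block == 0, "H and W must divide by block"
    # Split H and W into (coarse, within-block) pairs.
    x = x.reshape(c, h // block, block, w // block, block)
    # Bring the within-block offsets next to the channel axis.
    x = x.transpose(0, 2, 4, 1, 3)
    # Merge channel and block offsets into one channel dimension.
    return x.reshape(c * block * block, h // block, w // block)

# Example: a 1x4x4 map becomes 4x2x2; channel 0 holds the top-left
# pixel of each 2x2 block.
x = np.arange(16, dtype=np.float32).reshape(1, 4, 4)
y = space_to_depth(x, block=2)
print(y.shape)  # (4, 2, 2)
print(y[0])     # [[ 0.  2.], [ 8. 10.]]
```

In SPD-Conv this rearrangement is followed by a non-strided convolution that mixes the expanded channels back down, which is presumably where the "Convolutional" part of CSPD comes in.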