Object detection in remote sensing images is of significant research value in fields such as environmental monitoring and urban planning. However, large variations in object scale, together with small and densely packed targets, make this task particularly difficult. To address these issues, we propose ADS-YOLO, a multi-scale feature extraction algorithm for remote sensing object detection based on dilated residuals. Firstly, to cope with scale variation and small target size, the Dilation-wise Residual (DWR) design is used to build the C2f_DWR module, which restructures the Bottleneck within the C2f block to extract and fuse multi-scale contextual information, easing the difficulty caused by target scale variation. Secondly, inspired by the ADown downsampling module from YOLOv9, we replace the downsampling convolutions in the Backbone with ADown, enabling the model to capture finer image details at higher levels while maintaining accuracy and reducing computational load. Lastly, to handle densely packed targets, we design the Soft-NMS-ShapeIoU module, which improves the consistency between predicted boxes and target shapes while suppressing redundant adjacent boxes. Experimental results on the publicly available remote sensing datasets DIOR, RSOD, and NWPU VHR-10 demonstrate that the proposed ADS-YOLO model outperforms other state-of-the-art methods by a significant margin.
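To illustrate the Soft-NMS idea underlying the Soft-NMS-ShapeIoU module, the following is a minimal NumPy sketch of Gaussian Soft-NMS. The function names (`iou`, `soft_nms`) and the parameters `sigma` and `score_thresh` are illustrative assumptions, and a plain IoU stands in for the paper's ShapeIoU metric, which would replace it in the actual module.

```python
import numpy as np

def iou(box, boxes):
    """Plain IoU between one box and an array of boxes in (x1, y1, x2, y2) format.
    A stand-in for the paper's ShapeIoU metric."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of discarding neighbours of the top-scoring box,
    decay their scores by exp(-iou^2 / sigma), which helps retain densely packed objects."""
    scores = scores.astype(float).copy()
    keep = []
    idxs = np.arange(len(scores))
    while len(idxs) > 0:
        top = np.argmax(scores[idxs])
        best = idxs[top]
        keep.append(best)
        idxs = np.delete(idxs, top)
        if len(idxs) == 0:
            break
        overlaps = iou(boxes[best], boxes[idxs])
        scores[idxs] *= np.exp(-(overlaps ** 2) / sigma)  # soft decay instead of hard removal
        idxs = idxs[scores[idxs] > score_thresh]          # drop boxes whose score fell too low
    return keep
```

Replacing the IoU term with a shape-aware overlap measure and using the soft score decay is the general mechanism by which neighbouring detections in dense scenes are kept rather than eliminated outright.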