To address the challenges of multi-scale variability, category imbalance, and high background similarity in aluminum surface defect detection, this paper proposes the YOLO-PDC model. First, Partial Convolution (PConv) and Deformable ConvNets v2 (DCNv2) replace the traditional convolutions in the ELAN and MaxPool modules of the YOLOv7 backbone. This configuration forms the PD-ME module, which mitigates the non-uniform scale variation among different defect types in the aluminum dataset. It also reduces computational redundancy and memory access, enabling efficient extraction of spatial features and improving inference speed. Next, a 3D attention module (SimAM) is incorporated into the YOLOv7 detection head after the two upsampling steps and within the two MaxPool structures, creating the Sim-CM attention mechanism. This addition enhances detection accuracy without introducing additional parameters. Additionally, during training, the Focal loss function replaces the CIoU loss function. Focal loss dynamically decreases the weight of easily distinguishable samples through a scaling factor, allowing the model to focus on hard-to-distinguish samples and addressing the low detection accuracy caused by sample imbalance. Experimental results demonstrate that the proposed YOLO-PDC model achieves a high mean Average Precision (mAP) of 87.7% and a real-time detection speed of 114 frames per second. Compared to the original YOLOv7, mAP50 and mAP50:90 improve by 5.2% and 12.2%, respectively, while the number of parameters and the computational cost decrease by 2.18 million and 22.2 billion, respectively. Furthermore, compared to the recent defect detection models DETR, Swin-T, and ConvNeXt-T, the mAP50 of YOLO-PDC is higher by 15.2%, 17.9%, and 16.2%, respectively. YOLO-PDC also surpasses existing state-of-the-art detection methods in terms of detection accuracy.
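The down-weighting of easy samples by a scaling factor can be illustrated with a minimal sketch. It assumes the standard binary focal loss form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the exact variant and hyperparameters used in YOLO-PDC are not stated here, so the function names and the values gamma = 2.0 and alpha = 0.25 are illustrative assumptions, not the paper's settings.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (reference baseline)."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    p     : predicted probability of the positive class, 0 < p < 1
    y     : ground-truth label, 1 or 0
    gamma : focusing parameter (assumed value, not taken from the paper)
    alpha : class-balancing weight (assumed value, not taken from the paper)
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma is the scaling factor: it tends to 0 for
    # well-classified samples and stays near 1 for misclassified ones.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confidently correct prediction
hard = focal_loss(0.30, 1)  # poorly classified positive sample

print(f"easy sample: CE={cross_entropy(0.95, 1):.4f}  FL={easy:.6f}")
print(f"hard sample: CE={cross_entropy(0.30, 1):.4f}  FL={hard:.6f}")
```

Under these assumed settings, the easy sample keeps only a fraction 0.25 × 0.05² of its cross-entropy loss, while the hard sample keeps 0.25 × 0.7², so hard samples dominate the training signal.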