To achieve accurate and efficient recognition of calf behavior in complex scenes involving overlapping animals, occlusion, and varying illumination, this study improves the YOLOv8 model for calf behavior recognition. A daily calf behavior dataset of 2918 images, built by extracting frames from video, serves as the experimental benchmark. A P2 small-target detection layer is introduced, adding a higher-resolution detection head for small objects that significantly improves recognition accuracy, and the LAMP pruning method is applied to reduce the model's computational complexity and storage requirements. The improved model is compared with SSD, YOLOv5n, YOLOv8n, YOLOv8-C2f-Faster-EMA, YOLOv11n, YOLOv12n, and YOLOv8-P2. With the P2 detection layer and LAMP pruning, the model reaches 0.949 M parameters, 4.0 G floating-point operations (FLOPs), a 2.3 MB model size, and a mean average precision (mAP) of 90.9%, substantially reducing model size while improving detection accuracy. In complex environments with different illumination and occlusion levels, the model achieves an mAP of 85.1% in daytime (exposure) and 84.8% in nighttime scenes, and an average mAP of 87.3% across the three occlusion levels (light, medium, and heavy), demonstrating a lightweight, accurate, real-time, and robust model. These results provide a reference for all-day, real-time monitoring of calf behavior in complex environments.
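To make the two modifications concrete, the sketch below (not the authors' code) shows how a YOLOv8 variant with a P2 detection head can be instantiated from the Ultralytics "yolov8n-p2.yaml" model configuration, and how LAMP (layer-adaptive magnitude pruning) scores can be computed per layer so that a single global threshold selects the weights to prune. The sparsity target and usage are placeholders for illustration only; the actual training settings and pruning ratio follow the paper's experimental setup.

```python
# Minimal sketch, assuming the Ultralytics yolov8-p2 model config and an
# unstructured LAMP scoring rule; this is illustrative, not the paper's pipeline.
import torch
from ultralytics import YOLO


def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """LAMP scores for one layer's weight tensor.

    Each weight's score is its squared magnitude divided by the sum of squared
    magnitudes of all weights in the same layer that are at least as large.
    Because the scores are normalized per layer, they are comparable across
    layers and a single global threshold can be used for pruning.
    """
    flat = weight.detach().flatten().abs()
    sorted_sq, order = torch.sort(flat ** 2)                    # ascending
    # Suffix sums: for each position, the sum of its own square and all larger squares.
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_sq, [0]), 0), [0])
    scores_sorted = sorted_sq / suffix
    scores = torch.empty_like(scores_sorted)
    scores[order] = scores_sorted                               # back to original order
    return scores.view_as(weight)


# Hypothetical usage: build a YOLOv8n model with the extra P2 (stride-4) head,
# score all convolution weights, and pick a global threshold (50% is a placeholder).
model = YOLO("yolov8n-p2.yaml")
all_scores = torch.cat([
    lamp_scores(m.weight).flatten()
    for m in model.model.modules()
    if isinstance(m, torch.nn.Conv2d)
])
threshold = torch.quantile(all_scores, 0.5)
print(f"global LAMP threshold: {threshold:.3e}")
```

In practice the pruning step would remove the connections (or, for structured pruning, whole channels ranked by an analogous group score) whose LAMP scores fall below the chosen threshold, followed by fine-tuning to recover accuracy, which is what allows the parameter count and model size reported above to drop without sacrificing mAP.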