This study proposes a large-scale distributed privacy-preserving machine learning algorithm based on federated learning. The algorithm allows participants to jointly train high-quality models without sharing their original data, addressing the challenges posed by increasingly stringent data privacy and security regulations. To verify the performance of the federated learning system in a real-world environment, we built a distributed experimental platform consisting of multiple physical servers and evaluated it on several publicly available datasets, including MNIST, Federated EMNIST, and Federated CIFAR10/100. The experimental results show that the federated learning system reaches an accuracy of 97.3%, slightly below the 98.2% achieved by centralized training; we consider this an acceptable trade-off given the data-privacy advantages of the federated approach. Moreover, accuracy drops only slightly, to about 96.8%, after malicious clients are introduced, demonstrating the robustness of the federated learning system. Specifically, we apply differential privacy with a privacy budget of ε = 1.0, adding Gaussian noise to each model update to ensure that even a malicious user with access to the model updates cannot infer sensitive information about any individual user. The experimental conditions include, but are not limited to, the following: model updates are protected in transit with homomorphic encryption; the average communication volume is 150 MB per iteration, for a total of 30 GB; and average client CPU and GPU utilization are about 70% and 80%, respectively. These settings ensure efficient use of the system's computing resources and reflect the balance between privacy protection and model performance.
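
As a minimal sketch of the update-privatization step described above, the following Python snippet clips a client's model update and adds Gaussian noise before it leaves the device. It assumes per-update L2-norm clipping with a Gaussian mechanism; the function name, clipping norm, and noise multiplier are illustrative choices, since the paper states only the overall privacy budget (ε = 1.0), not the exact noise calibration.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client's model update and add Gaussian noise (DP-SGD style).

    Note: clip_norm and noise_multiplier are illustrative; mapping them to
    a target privacy budget such as epsilon = 1.0 requires a separate
    privacy-accounting step not shown here.
    """
    rng = rng or np.random.default_rng()
    # Clip the update's L2 norm to bound each client's sensitivity.
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Add isotropic Gaussian noise scaled to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

# Example: privatize a flattened weight-delta vector from one client.
delta = np.random.randn(1000)
private_delta = privatize_update(delta)
```

In a full system, the privatized update would additionally be encrypted (e.g., with the homomorphic encryption layer mentioned above) before being sent to the server for aggregation.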