As networks grow in size and traffic becomes burstier, Data Center Networks (DCNs), the core infrastructure of High-Performance Computing (HPC), require a high-performance, robust, and scalable load balancing method. However, existing work does not yet meet these design objectives well. In this paper, we design, analyze, and evaluate a novel Adaptive Load Balancing scheme based on Traffic Prediction (ALB-TP) to achieve these goals. ALB-TP uses a Gated Recurrent Unit and Attention (GRU-Attention) model to dynamically predict path congestion information for the whole network. Compared with existing schemes that collect congestion status information at a fixed time interval, the proposed GRU-Attention model improves the timeliness and accuracy of congestion information collection. With global congestion awareness, ALB-TP, which in its actual implementation forwards flows to the least congested path via two-stage routing, is more robust than existing congestion-agnostic schemes on asymmetric topologies. Additionally, ALB-TP adopts a distributed control structure that captures the congestion information of the entire network in parallel, which makes it more scalable than existing congestion-aware schemes for large-scale networks. Evaluations on the Fat-Tree topology show that ALB-TP effectively alleviates network congestion and balances flows across different paths. Compared to existing GRU and LSTM models, the proposed GRU-Attention model improves the accuracy of congestion information prediction by 28.2% on average. Simulation results show that, compared to existing schemes, ALB-TP reduces the Flow Completion Time (FCT) by an average of 18.5% and improves throughput by an average of 31.6%. Theoretical design and experimental analysis together show that ALB-TP effectively balances the traffic load on asymmetric topologies and achieves its load-balancing design goals.
Compared with existing schemes, ALB-TP also performs better in terms of FCT, throughput, and accuracy of congestion information collection.
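
The forwarding decision summarized above, sending a flow to the path with the lowest predicted congestion, can be illustrated with a minimal Python sketch. It is not the ALB-TP implementation: the function name `least_congested_path`, the path labels, and the congestion values are hypothetical, the predictions are assumed to be already available as one score per path (lower is better), and neither the GRU-Attention model nor the two-stage routing is modeled.

```python
from typing import Dict, Hashable


def least_congested_path(predicted: Dict[Hashable, float]) -> Hashable:
    """Return the candidate path with the lowest predicted congestion.

    `predicted` maps a path identifier to a congestion score (assumed here
    to be a predicted utilization; lower means less congested).
    """
    if not predicted:
        raise ValueError("no candidate paths")
    # Ties resolve to the first path in the dict's insertion order.
    return min(predicted, key=predicted.get)


# Hypothetical predicted congestion for four candidate paths.
predicted = {"path-0": 0.72, "path-1": 0.35, "path-2": 0.90, "path-3": 0.35}
choice = least_congested_path(predicted)
print(choice)
```

A congestion-agnostic scheme would pick among the four paths without looking at these scores, which is why it can keep loading a degraded path in an asymmetric topology; the sketch only shows the selection step that congestion awareness makes possible.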