This paper proposes a cross-modal pedestrian re-identification method based on multi-scale features and an attention balancing strategy. The method improves recognition accuracy by integrating information across scales, dynamically adjusting attention, and balancing the contributions of different modalities. The model architecture comprises a multi-scale feature extraction module, an attention mechanism, a strategy balancing mechanism, and a classifier. Experimental results show that the proposed model outperforms the baseline and other existing methods on several public datasets, including Market-1501, DukeMTMC-reID, and CUHK03; on Market-1501 in particular, mAP and Rank-1 reach 0.83 and 0.89, respectively. In addition, integrating RGB and thermal modality information further improves the model's recognition ability, demonstrating the effectiveness of cross-modal information integration.
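The pipeline described above can be sketched at a high level as attention-weighted fusion of per-scale features followed by a weighted combination of the two modality embeddings. This is a minimal NumPy illustration of that idea, not the paper's implementation: the mean-based attention score and the balance coefficient `alpha` are hypothetical stand-ins for the learned attention mechanism and strategy balancing mechanism.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_multiscale(features):
    """Attention-weighted fusion of multi-scale features.

    `features`: list of (D,) feature vectors extracted at different scales.
    The per-scale attention score here is a toy mean-activation heuristic;
    in the paper it would be produced by a learned attention module.
    """
    F = np.stack(features)              # (S, D): one row per scale
    weights = softmax(F.mean(axis=1))   # (S,): attention weight per scale
    return (weights[:, None] * F).sum(axis=0)  # (D,): fused embedding

def balance_modalities(f_rgb, f_thermal, alpha=0.5):
    """Strategy balancing: weighted combination of modality embeddings.

    `alpha` is a hypothetical balance coefficient; the paper's strategy
    balancing mechanism would adjust this contribution dynamically.
    """
    return alpha * f_rgb + (1 - alpha) * f_thermal

# Usage: fuse three scales per modality, then balance RGB and thermal.
rng = np.random.default_rng(0)
rgb_scales = [rng.standard_normal(8) for _ in range(3)]
thermal_scales = [rng.standard_normal(8) for _ in range(3)]
embedding = balance_modalities(fuse_multiscale(rgb_scales),
                               fuse_multiscale(thermal_scales))
print(embedding.shape)  # (8,)
```

The final embedding would then be fed to the classifier (or matched by distance) for re-identification.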