Widespread adoption of rooftop photovoltaics (PV) is recognized globally as a key response to the building decarbonization crisis, which makes the prediction of rooftop PV potential a crucial evaluation task. Current prediction methods have significant shortcomings: geospatial approaches lack precision at urban scales, historical time-series methods tend to overestimate potential, and urban studies often neglect shading between neighboring buildings, which inflates predictions. This study addresses these issues by employing a Graph Convolutional Network-Long Short-Term Memory (GCN-LSTM) model for spatiotemporal prediction of urban rooftop PV potential, incorporating the spatial shading relationships between buildings to improve prediction accuracy. The results show that, compared with traditional Long Short-Term Memory (LSTM) models, GCN-LSTM significantly improves prediction accuracy, reducing MAE by 21 %, MSE by 22 %, RMSE by 13 %, and MAPE by 12 %. The improvement is most evident in winter and summer, which supports the interpretability of the GCN-LSTM model. Moreover, clustering analysis of the shading relationship graphs between urban buildings identified three primary types of graph clusters: moderately diverse medium-scale building shading graphs, simple small-scale building shading graphs, and complex large-scale building shading graphs. The number of buildings, the standard deviation of building heights, and the standard deviation of roof slopes were found to jointly influence the complexity and shading intensity of these graphs, leading to variations in PV potential. These findings indicate that integrating deep learning models with engineering physics knowledge can substantially enhance the accuracy of urban rooftop PV potential predictions and provide a basis for formulating and implementing PV promotion policies.
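The error reductions reported above are relative changes in four standard metrics: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). The following is a minimal sketch of how such a comparison could be computed, not the evaluation code used in this study. The function names `error_metrics` and `relative_reduction`, the choice to skip zero-valued observations when computing MAPE, and all numeric values are illustrative assumptions.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # Assumption: MAPE is undefined where the observed value is zero
    # (for example, night-time PV output), so those points are skipped.
    ratios = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


def relative_reduction(baseline, improved):
    """Percentage by which each metric drops from the baseline to the improved model."""
    return {
        name: 100.0 * (baseline[name] - improved[name]) / baseline[name]
        for name in baseline
    }


# Hypothetical values for illustration only; these are not data from the study.
observed = [2.0, 4.0, 5.0, 4.0]
lstm_forecast = [3.0, 3.0, 6.0, 5.0]
gcn_lstm_forecast = [2.5, 3.5, 5.5, 4.5]

baseline = error_metrics(observed, lstm_forecast)
improved = error_metrics(observed, gcn_lstm_forecast)
for name, value in relative_reduction(baseline, improved).items():
    print(f"{name} reduced by {value:.1f} %")
```

Because MSE squares each error, its relative reduction generally differs from that of RMSE on the same data, which is why the four percentages are reported separately.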