Background
Osteoporosis is a complex condition that significantly affects patients and healthcare providers and motivates ongoing research into its causes, diagnosis, treatment, and prevention. Orthopantomogram (OPG) radiography is being explored for osteoporosis screening as an alternative to bone mineral density (BMD) assessment. Although this approach relies on various indicators, manual analysis can be challenging, and machine learning and deep learning (DL) techniques have been developed to address this. This systematic review and meta-analysis is the first to evaluate the accuracy of deep learning models in predicting osteoporosis from OPG radiographs, providing evidence for their performance and clinical use.

Methods
A literature search was conducted in MEDLINE, Scopus, and Web of Science up to February 10, 2025, using keywords related to deep learning, osteoporosis, and panoramic radiography. Titles, abstracts, and full texts were screened against the inclusion and exclusion criteria. Meta-analysis was performed using a bivariate random-effects model to pool diagnostic accuracy measures, and subgroup analyses explored sources of heterogeneity.

Results
We identified 204 articles, excluded 189 as duplicates or irrelevant studies, assessed 15 articles, and ultimately included seven studies. The DL models showed AUC values of 66.8% to 99.8%, with sensitivity ranging from 59% to 97% and specificity from 64.9% to 100%. No significant differences in diagnostic accuracy were found among subgroups. AlexNet had the highest performance, achieving a sensitivity of 0.89 and a specificity of 0.99. Sensitivity analysis showed that excluding outliers had little impact on the results. Deeks' funnel plot indicated no significant publication bias (P = 0.54).

Conclusions
This systematic review and meta-analysis indicates that deep learning models for osteoporosis diagnosis achieved 80% sensitivity, 92% specificity, and an AUC of 93%. Models such as AlexNet and ResNet were effective. These findings suggest that DL models are promising for noninvasive early detection, but larger multicenter studies are necessary to validate their efficacy in at-risk groups.
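
The pooling step named in the Methods can be illustrated with a minimal sketch. It is not the bivariate random-effects model used in this review, which models sensitivity and specificity jointly along with their correlation. As a simplified stand-in, it pools a single proportion (for example, sensitivity) across studies on the logit scale with a univariate DerSimonian-Laird random-effects estimator. The function name `pool_proportion` and the per-study counts are hypothetical and chosen only for illustration; they are not data from the included studies.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals):
    """Pool per-study proportions with a DerSimonian-Laird random-effects model.

    events[i] / totals[i] is the proportion in study i, e.g. true positives
    over all diseased cases for sensitivity. Returns the pooled estimate,
    the lower and upper 95% confidence limits, and tau^2.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n
        e_c = e + 0.5
        f_c = (n - e) + 0.5
        ys.append(math.log(e_c / f_c))      # logit of the corrected proportion
        vs.append(1.0 / e_c + 1.0 / f_c)    # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / v for v in vs]
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))

    # Between-study variance tau^2 (method of moments, truncated at zero)
    k = len(ys)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return (
        inv_logit(y_re),
        inv_logit(y_re - 1.96 * se),
        inv_logit(y_re + 1.96 * se),
        tau2,
    )


if __name__ == "__main__":
    # Hypothetical counts: true positives and diseased cases per study
    tp = [45, 30, 88]
    diseased = [50, 40, 100]
    est, lo, hi, tau2 = pool_proportion(tp, diseased)
    print(f"pooled sensitivity = {est:.3f} (95% CI {lo:.3f} to {hi:.3f}), tau^2 = {tau2:.3f}")
```

Calling the same function with true negatives and non-diseased counts would give a pooled specificity. Pooling the two measures separately ignores the trade-off between them across studies, which is the reason a bivariate model is usually preferred for diagnostic accuracy meta-analysis.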