Application of machine learning methods for predicting infant mortality in Rwanda: analysis of Rwanda demographic health survey 2014-15 dataset

Cited by: 19
Authors
Mfateneza, Emmanuel [1]
Rutayisire, Pierre Claver [2]
Biracyaza, Emmanuel [3]
Musafiri, Sanctus [4]
Mpabuka, Willy Gasafari [5]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, Kigali, Rwanda
[2] Univ Rwanda, Appl Stat Dept, Kigali, Rwanda
[3] Prison Fellowship Rwanda, Kigali, Rwanda
[4] Univ Rwanda, Clin Dept Internal Med, Kigali, Rwanda
[5] Transparency Int Rwanda, Kigali, Rwanda
Keywords
Infant mortality; Machine Learning; Logistic regression; Model accuracy; MODEL;
DOI
10.1186/s12884-022-04699-8
Chinese Library Classification number
R71 [Obstetrics and Gynecology];
Discipline classification code
100211;
Abstract
Background: Extensive research on infant mortality (IM) exists in developing countries; however, most of the methods applied thus far have relied on conventional regression analyses with limited predictive capability. Advanced machine learning (AML) methods can predict IM more accurately, yet no study in Rwanda has used ML methods. This study therefore applied machine learning methods to predict infant mortality in Rwanda.
Methods: A cross-sectional study was conducted using the 2014-15 Rwanda Demographic and Health Survey. Python version 3.8 was used to train and test four ML methods: Random Forest (RF), Decision Tree, Support Vector Machine, and Logistic Regression. STATA version 13 was used for the conventional analyses. Model performance was evaluated with the confusion matrix, accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC).
Results: Model accuracy ranged from 61.5% (logistic regression) to 84.3% (RF). The RF model was the best predictive model of IM, with accuracy 84.3%, recall 91.3%, precision 80.3%, F1 score 85.5%, and AUROC 84.2%. It was followed by the decision tree model (accuracy 83%, recall 91%, precision 79%, F1 score 84.67%, AUROC 82.9%) and the support vector machine (accuracy 68.6%, recall 74.9%, precision 67%, F1 score 70.73%, AUROC 68.6%). Logistic regression performed worst (accuracy 61.5%, recall 61.1%, precision 62.2%, F1 score 61.6%, AUROC 61.5%). The predictive models identified marital status, children ever born, birth order, and wealth index as the top four predictors of IM.
Conclusions: In developing a predictive model, ML methods can uncover hidden information that traditional statistical methods could not detect. Random Forest was identified as the best classifier for predictive models of IM.
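The F1 scores reported in the abstract can be cross-checked against the stated precision and recall. The following is a minimal sketch, assuming F1 is the usual harmonic mean of precision and recall; the function names `f1_score` and `check_reported` and the 0.15-point tolerance are illustrative choices, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit in, same unit out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as stated in the abstract.
REPORTED = {
    "Random Forest": (80.3, 91.3, 85.5),
    "Decision Tree": (79.0, 91.0, 84.67),
    "Support Vector Machine": (67.0, 74.9, 70.73),
    "Logistic Regression": (62.2, 61.1, 61.6),
}


def check_reported(tolerance: float = 0.15) -> dict:
    """Recompute F1 per model and flag whether it matches the reported value.

    The tolerance (in percentage points) is an assumption that allows for
    rounding of the published precision and recall figures.
    """
    results = {}
    for model, (precision, recall, reported_f1) in REPORTED.items():
        computed = f1_score(precision, recall)
        results[model] = (computed, reported_f1, abs(computed - reported_f1) <= tolerance)
    return results


if __name__ == "__main__":
    for model, (computed, reported_f1, ok) in check_reported().items():
        status = "consistent" if ok else "check"
        print(f"{model}: computed {computed:.2f}%, reported {reported_f1}% ({status})")
```

Under these assumptions, each recomputed F1 score falls within the chosen tolerance of the reported figure, so the per-model metrics in the abstract are internally consistent.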
Pages: 13