Classify Alzheimer genes association using Naive Bayes algorithm

Cited by: 0
Authors
Raj, Sushrutha [1]
Vishnoi, Anchal [2]
Srivastava, Alok [2, 3]
Affiliations
[1] Amity Univ Haryana, Amity Inst Integrat Sci & Hlth, Amity Educ Valley, Gurgaon 122413, India
[2] Sri Innovat & Res Fdn, Ghaziabad 201009, India
[3] L V Prasad Eye Inst, Hyderabad 500034, Telangana, India
Source
HUMAN GENE | 2024 / Vol. 41
Keywords
Disease gene associations; Alzheimer's candidate genes; Machine learning; Text mining; Text classification; Cross validation; TEXT-MINING SYSTEM; HUMAN-DISEASES; IDENTIFICATION; LINKAGE; DRUGS; PRIORITIZATION; GENOMICS; DATABASE; TARGETS
DOI
10.1016/j.humgen.2024.201309
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Background: Alzheimer's disease, the most common form of dementia, accounts for 60-80% of dementia cases, and its prevalence is projected to increase as populations age. By 2050, the number of individuals with Alzheimer's disease and dementia worldwide is expected to reach 152 million. Genetics contributes about 70% of the overall risk, so understanding the genetic basis is important for developing targeted interventions. This study presents a system that combines text mining and machine learning to identify and prioritize candidate genes for Alzheimer's disease and to classify them into three weighted association classes.
Methods: The machine learning classifier was trained on a curated gold-standard dataset and validated with 10-fold cross-validation, which showed consistent performance across all folds. The resulting ensemble learning system uses text mining and a Bayesian classification algorithm to categorize PubMed abstracts into three groups: Yes, No, and Ambiguous. The classifier is then applied to previously unseen disease-specific data to predict disease-gene associations.
Results: With an average accuracy of 87.33% and a confidence level of 90.10% +/- 0.142, the protocol extracted 2031 associated genes, of which 1162, 489, and 1439 belong to the positive, negative, and ambiguous classes, respectively, at a threshold of 0.9. Compared with established disease-gene databases, the system identified 915 positive genes that had not been previously reported. The positive genes can support in-depth study, and the ambiguous genes are candidates for further exploration of their association with Alzheimer's disease.
Conclusions: The system's ability to generate accurate predictions demonstrates its robustness and provides insight into the genetic factors of Alzheimer's disease. Consequently, this study contributes to existing knowledge and paves the way for future research in this field.
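The general technique named in the abstract (Naive Bayes text classification into Yes / No / Ambiguous, k-fold cross-validation, and a probability threshold) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it is a single multinomial Naive Bayes model rather than an ensemble, the tokenizer is deliberately crude, the sentences in `TOY` are invented placeholders, and the names `NaiveBayes`, `cross_validate`, and `confident_label` are hypothetical. The abstract does not say how the 0.9 threshold is applied, so returning `None` below the threshold is an assumption.

```python
import math
from collections import Counter


def tokenize(text):
    # Crude tokenizer: lowercase, treat every non-alphanumeric character as a space.
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = set().union(*self.word_counts.values())
        self.totals = {c: sum(self.word_counts[c].values()) for c in self.classes}
        self.n_docs = len(docs)
        return self

    def predict_proba(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        v = len(self.vocab)
        log_scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / self.n_docs)  # log prior
            for t in tokens:
                score += math.log((self.word_counts[c][t] + 1) / (self.totals[c] + v))
            log_scores[c] = score
        # Normalize in log space to avoid underflow on long documents.
        m = max(log_scores.values())
        exp = {c: math.exp(s - m) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}

    def predict(self, doc):
        proba = self.predict_proba(doc)
        return max(proba, key=proba.get)


def cross_validate(docs, labels, k=10):
    # Plain (unstratified) k-fold: item i goes to fold i % k. Requires len(docs) >= k.
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        pairs = list(zip(docs, labels))
        train = [p for i, p in enumerate(pairs) if i not in test_idx]
        test = [p for i, p in enumerate(pairs) if i in test_idx]
        model = NaiveBayes().fit(*zip(*train))
        correct = sum(model.predict(d) == l for d, l in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


def confident_label(model, doc, threshold=0.9):
    # Assumption: keep a prediction only if its posterior reaches the threshold.
    proba = model.predict_proba(doc)
    best = max(proba, key=proba.get)
    return best if proba[best] >= threshold else None


# Invented placeholder sentences, not taken from PubMed or from the paper.
TOY = [
    ("variant significantly associated with increased disease risk", "Yes"),
    ("mutation strongly associated with disease onset", "Yes"),
    ("gene significantly linked with increased risk", "Yes"),
    ("no association was found between gene and disease", "No"),
    ("variant not associated and no effect was found", "No"),
    ("no significant link was found for mutation", "No"),
    ("role of gene remains unclear and needs further study", "Ambiguous"),
    ("results were inconclusive and further study is needed", "Ambiguous"),
    ("evidence remains unclear and inconclusive for variant", "Ambiguous"),
]
DOCS, LABELS = zip(*TOY)

model = NaiveBayes().fit(DOCS, LABELS)
print(model.predict("no association was found"))
print(confident_label(model, "mutation significantly associated with increased risk"))
print(cross_validate(DOCS, LABELS, k=3))  # k=3 because the toy set has only 9 items
```

With a real corpus, the fold assignment would normally be shuffled and stratified by class, and accuracy would be reported per fold as well as on average to check the consistency the abstract describes.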
Pages: 12
Related papers
50 in total
  • [1] Using Naive Bayes Method to Classify Text-based Email
    Kang, LanLan
    Chen, Ruey-Shun
    Chen, Yeh-Cheng
    Cao, WenLiang
    2018 9TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES, ALGORITHMS AND PROGRAMMING (PAAP 2018), 2018, : 94 - 98
  • [2] Classify the Sentiments of email Contents using Novel Bidirectional Encoder Representation for Transformation over Naive Bayes Algorithm
    Chikkili, Hema Kumar
    Malathi, K.
    JOURNAL OF PHARMACEUTICAL NEGATIVE RESULTS, 2022, 13 : 319 - 328
  • [3] Classify the Sentiments of email Contents using Novel Bidirectional Encoder Representation for Transformation over Naive Bayes Algorithm
    Kumar, Chikkili Hema
    Malathi, K.
    JOURNAL OF PHARMACEUTICAL NEGATIVE RESULTS, 2022, 13 : 319 - 328
  • [4] A novel way to classify passenger data using Naive bayes algorithm (A Real Time Anti-Terrorism Approach)
    Singh, Saurabh
    Verma, Shashikant
    Tiwari, Akhilesh
    Tiwari, Aditya
    PROCEEDINGS ON 2016 2ND INTERNATIONAL CONFERENCE ON NEXT GENERATION COMPUTING TECHNOLOGIES (NGCT), 2016, : 312 - 316
  • [5] Prediction of Cardiac Disease using Naive Bayes Algorithm
    Ramesh, S. M.
    Sengottaiyan, N.
    Vanathi, D.
    Manoja, R.
    Tamizharasu, K.
    Kalyanasundara, P.
    2ND INTERNATIONAL CONFERENCE ON SUSTAINABLE COMPUTING AND SMART SYSTEMS, ICSCSS 2024, 2024, : 994 - 997
  • [6] Intrusion Detection System using Naive Bayes algorithm
    Sharmila, B. S.
    Nagapadma, Rohini
    2019 5TH IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE 2019), 2019,
  • [7] An Approach to Classify Eligibility Blood Donors Using Decision Tree and Naive Bayes Classifier
    Zulfikar, W. B.
    Gerhana, Y. A.
    Rahmania, A. F.
    2018 6TH INTERNATIONAL CONFERENCE ON CYBER AND IT SERVICE MANAGEMENT (CITSM), 2018, : 563 - 567
  • [8] Classify Text-based Email Using Naive Bayes Method With Small Sample
    Zhu, Yanjun
    Zhu, Ting
    Li, Jianxin
    Cao, Wenliang
    Yong, Peng
    Jiang, Fei
    Liu, Jie
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2023, 39 (04) : 855 - 868
  • [9] Classification of Heart Disease Using Naive Bayes and Genetic Algorithm
    Kumar, Santosh
    Sahoo, G.
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, VOL 2, 2015, 32 : 269 - 282
  • [10] Object Position Estimation Using Naive Bayes Classifier Algorithm
    Malik, Reza Firsandaya
    Pratama, Eko
    Ubaya, Huda
    Zulfahmi, Rido
    Stiawan, Deris
    Exaudi, Kemahyanto
    2018 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND COMPUTER SCIENCE (ICECOS), 2018, : 39 - 43