Protein Sequence-Based COVID-19 Detection: A Comparative Study of Machine Learning Classification Methods

Cited by: 0
Authors
Aminah, Siti [1]
Ardaneswari, Gianinna [1]
Awang, Mohd Khalid [2]
Yusaputra, Muhammad Ariq [1]
Sari, Dian Puspita [1]
Affiliations
[1] Univ Indonesia, Fac Math & Nat Sci, Dept Math, Depok 16424, Indonesia
[2] Univ Sultan Zainal Abidin, Fac Informat & Comp, Besut 22200, Terengganu, Malaysia
Keywords
Compendex
DOI
10.1155/2024/8683822
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Coronaviruses, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remain a significant global public health challenge in 2024. Despite advances in vaccines and treatments, accurate classification of coronavirus protein sequences remains crucial for monitoring variants, understanding viral behavior, and developing targeted interventions. In this study, we investigate how well several classification methods classify coronavirus protein sequences. We use the K-nearest neighbor (KNN), fuzzy KNN (FKNN), support vector machine (SVM), and SVM with particle swarm optimization (PSO-SVM) algorithms, complemented by feature selection techniques including principal component analysis (PCA) and random forest-recursive feature elimination (RF-RFE). Our dataset comprises 2000 protein sequences, evenly split between SARS-CoV-2 and non-SARS-CoV-2 sequences. We evaluate each classification model in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (ROC-AUC). All models perform consistently well. The PCA + PSO-SVM model performs best, achieving the highest classification accuracy, specificity, and ROC-AUC score in distinguishing SARS-CoV-2 from non-SARS-CoV-2 sequences. Overall, the study highlights the value of combining advanced classification methods with feature selection techniques for classifying coronavirus protein sequences. The findings offer insights for researchers and practitioners in bioinformatics and contribute to ongoing efforts to understand and combat the COVID-19 pandemic and its evolving challenges.
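The four evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration, with label 1 standing for SARS-CoV-2 and label 0 for non-SARS-CoV-2. ROC-AUC is computed here through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative), which may differ from the procedure used in the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy_sensitivity_specificity(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): four positives followed by four negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 turns scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc, sens, spec = accuracy_sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, sens, spec, auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank the two classes, which is why the two kinds of measure are usually reported together.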
Pages: 14