Protein Sequence-Based COVID-19 Detection: A Comparative Study of Machine Learning Classification Methods

Times cited: 0
Authors
Aminah, Siti [1]
Ardaneswari, Gianinna [1]
Awang, Mohd Khalid [2]
Yusaputra, Muhammad Ariq [1]
Sari, Dian Puspita [1]
Affiliations
[1] Univ Indonesia, Fac Math & Nat Sci, Dept Math, Depok 16424, Indonesia
[2] Univ Sultan Zainal Abidin, Fac Informat & Comp, Besut 22200, Terengganu, Malaysia
Keywords
Compendex;
DOI
10.1155/2024/8683822
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Coronaviruses, including severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continue to pose a significant public health challenge globally, even in 2024. Despite advancements in vaccines and treatments, the accurate classification of coronavirus protein sequences remains crucial for monitoring variants, understanding viral behavior, and developing targeted interventions. In this study, we investigate the efficacy of various classification methods in accurately classifying coronavirus protein sequences. We explore the use of K-nearest neighbor (KNN), fuzzy KNN (FKNN), support vector machine (SVM), and SVM with particle swarm optimization (PSO-SVM) algorithms for classification, complemented by feature selection techniques including principal component analysis (PCA) and random forest-recursive feature elimination (RF-RFE). Our dataset comprises 2000 protein sequences, evenly split between SARS-CoV-2 and non-SARS-CoV-2 sequences. Through rigorous analysis, we evaluate the performance of each classification model in terms of accuracy, sensitivity, specificity, and receiver operating characteristic area under the curve (ROC-AUC). Our findings demonstrate consistently high performance across all models, reflecting their efficacy in classifying coronavirus protein sequences. Notably, the PCA + PSO-SVM model emerges as the top-performing model, exhibiting the highest classification accuracy, specificity, and ROC-AUC score, demonstrating its effectiveness in distinguishing between SARS-CoV-2 and non-SARS-CoV-2 sequences. Overall, our study highlights the importance of employing advanced classification methods and feature selection techniques in accurately classifying coronavirus protein sequences. The findings provide valuable insights for researchers and practitioners in the field of bioinformatics and contribute to ongoing efforts in understanding and combating the COVID-19 pandemic and its evolving challenges.
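The four evaluation measures named in the abstract (accuracy, sensitivity, specificity, and ROC-AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are assumptions made purely for illustration, with label 1 standing for SARS-CoV-2 and 0 for non-SARS-CoV-2.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 is the positive class."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, scores, threshold=0.5):
    """Compute the four measures from true labels and classifier scores."""
    # Hypothetical decision rule: predict positive when the score reaches the threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "roc_auc": roc_auc(y_true, scores),
    }


# Toy data, invented for illustration only (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = evaluate(y_true, scores)
print(metrics)
```

On this toy input, three of the four positives and three of the four negatives are classified correctly, so accuracy, sensitivity, and specificity are all 0.75, while 14 of the 16 positive-negative pairs are ranked correctly, giving a ROC-AUC of 0.875.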
Pages: 14
Related papers
50 in total
  • [21] COVID-19 studies involving machine learning methods: A bibliometric study
    Eden, Arzu Baygul
    Kayi, Alev Bakir
    Erdem, Mustafa Genco
    Demirci, Mehmet
    MEDICINE, 2023, 102 (43): E35564
  • [22] An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection
    Kouanou, Aurelle Tchagna
    Attia, Thomas Mih
    Feudjio, Cyrille
    Djeumo, Anges Fleurio
    Mouelas, Adele Ngo
    Nzogang, Mendel Patrice
    Tchapga, Christian Tchito
    Tchiotsop, Daniel
    JOURNAL OF HEALTHCARE ENGINEERING, 2021, 2021
  • [23] Classification and Detection of Rumors Related to COVID-19 Using Machine Learning-Based Smart Techniques
    Yang, Yancheng
    Zhai, Junqiao
    Nazir, Shah
    SAGE OPEN, 2025, 15 (01)
  • [24] Comparative Analysis of COVID-19 Detection Methods Based on Neural Network
    Hilali-Jaghdam, Ines
    Elhag, Azhari A.
    Ben Ishak, Anis
    Elnaim, Bushra M. Elamin
    Elhag, Omer Eltag Mohammed
    Abuhaimed, Feda Muhammed
    Abdel-Khalek, S.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 76 (01): 1127-1150
  • [25] Machine Learning Methods on COVID-19 Situation Prediction
    Yang, Zhihao
    Chen, Kang'an
    2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTER ENGINEERING (ICAICE 2020), 2020: 78-83
  • [26] Comparative analysis of machine learning-based classification models using sentiment classification of tweets related to COVID-19 pandemic
    Gulati, Kamal
    Kumar, S. Saravana
    Boddu, Raja Sarath Kumar
    Sarvakar, Ketan
    Sharma, Dilip Kumar
    Nomani, M. Z. M.
    MATERIALS TODAY-PROCEEDINGS, 2022, 51: 38-41
  • [27] MODELING AND CLASSIFICATION OF DEATHS DUE TO COVID-19 BASED ON MACHINE LEARNING TECHNIQUE
    Alharbi, Randa
    THERMAL SCIENCE, 2023, 27 (01): 405-410
  • [28] Sequence-Based Prediction of Plant Allergenic Proteins: Machine Learning Classification Approach
    Nedyalkova, Miroslava
    Vasighi, Mahdi
    Azmoon, Amirreza
    Naneva, Ludmila
    Simeonov, Vasil
    ACS OMEGA, 2023: 3698-3704
  • [29] A review of deep learning-based detection methods for COVID-19
    Subramanian, Nandhini
    Elharrouss, Omar
    Al-Maadeed, Somaya
    Chowdhury, Muhammed
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 143
  • [30] Machine Learning Methods for Model Classification: A Comparative Study
    Hernandez Lopez, Jose Antonio
    Rubei, Riccardo
    Sanchez Cuadrado, Jesus
    di Ruscio, Davide
    PROCEEDINGS OF THE 25TH INTERNATIONAL ACM/IEEE CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS, MODELS 2022, 2022: 165-175