Cancer detection with various classification models: A comprehensive feature analysis using HMM to extract a nucleotide pattern

Cited by: 1
Authors
Kalal, Vijay [1]
Jha, Brajesh Kumar [1]
Affiliations
[1] Pandit Deendayal Energy Univ, Sch Technol, Dept Math, Gandhinagar 382007, Gujarat, India
Keywords
Cancer detection; Nucleotide patterns; Classification models; Nucleotide sequences; Hidden Markov Model (HMM); Hidden Markov models; Probabilistic functions; Anticancer peptides
DOI
10.1016/j.compbiolchem.2024.108215
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
This work presents a novel feature extraction method that uses a Hidden Markov Model (HMM) to identify complex patterns in genomic sequences. We use the HMM to identify gene nucleotide patterns that are specific to malignant and non-malignant cells. DNA and RNA are crucial genetic components involved in many biological processes that affect both healthy and malignant cells. Early identification of patients is essential to successful cancer diagnosis and therapy. Varying nucleotide patterns indicate different cellular responses, which are important for understanding the molecular causes of cancer and associated disorders. We present a detailed study of nucleotide patterns in whole raw nucleotide sequences, with variations in both protein-coding sequence (CDS) and non-protein-coding sequence (NCDS), in malignant and non-malignant cells. Using the proposed HMM for feature extraction and selection achieves nucleotide prediction while reducing computational cost. The classification models implemented for cancer detection are Gradient-Boosted Decision Trees (GBDT), Random Forests (RF), Decision Trees (DT), and Support Vector Machines (SVM) with kernels. The accuracy of the proposed classification models has been validated with 10-fold cross-validation in comprehensive case studies. The results show that DT and ensemble learning techniques differentiate significantly between malignant and non-malignant DNA sequences, and that SVM with suitable kernels significantly improves cancer detection accuracy. Combining feature reduction approaches with HMM-based nucleotide pattern classifiers improves performance and ensures reliable cancer detection.
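The abstract does not give the HMM topology or parameters, so the following is only a minimal sketch of how an HMM can turn a raw nucleotide sequence into a numeric feature. It scores a sequence with the scaled forward algorithm under a two-state HMM. The names `forward_log_likelihood` and `hmm_feature`, the two-state layout (an AT-rich and a GC-rich state), and every probability in `START`, `TRANS`, and `EMIT` are assumptions made for illustration, not values from the paper.

```python
import math

NUCLEOTIDES = "ACGT"

# Hypothetical two-state HMM (illustrative values, not from the paper).
START = [0.5, 0.5]                      # initial state probabilities
TRANS = [[0.9, 0.1], [0.1, 0.9]]        # TRANS[r][s] = P(next state s | state r)
EMIT = [[0.4, 0.1, 0.1, 0.4],           # state 0: AT-rich emissions over A, C, G, T
        [0.1, 0.4, 0.4, 0.1]]           # state 1: GC-rich emissions over A, C, G, T


def forward_log_likelihood(seq, start=START, trans=TRANS, emit=EMIT):
    """Return log P(seq | HMM) computed with the scaled forward algorithm."""
    n_states = len(start)
    obs = NUCLEOTIDES.index(seq[0])
    # Forward variables for the first symbol.
    alpha = [start[s] * emit[s][obs] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(seq)):
        if t > 0:
            obs = NUCLEOTIDES.index(seq[t])
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs]
                for s in range(n_states)
            ]
        # Rescale at every step to avoid underflow on long sequences;
        # the logs of the scale factors sum to the log-likelihood.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def hmm_feature(seq):
    """Length-normalised log-likelihood, usable as one classifier feature."""
    return forward_log_likelihood(seq) / len(seq)


if __name__ == "__main__":
    for example in ("ATATTAAT", "GCGCCGGC"):
        print(example, round(hmm_feature(example), 4))
```

In a pipeline like the one the abstract describes, features of this kind (for instance, one score per class-specific HMM) would be passed to a classifier such as GBDT, RF, DT, or SVM and evaluated with 10-fold cross-validation. Normalising by sequence length keeps scores comparable between sequences of different lengths.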
Pages: 19