Blood cancer prediction using leukemia microarray gene data and hybrid logistic vector trees model

Authors
Vaibhav Rupapara
Furqan Rustam
Wajdi Aljedaani
Hina Fatima Shahzad
Ernesto Lee
Imran Ashraf
Affiliations
[1] Florida International University, School of Computing and Information Sciences
[2] Khwaja Fareed University of Engineering and Information Technology, Department of Computer Science
[3] University of North Texas, Department of Computer Science and Engineering
[4] Broward College, Department of Computer Science
[5] Yeungnam University, Department of Information and Communication Engineering
Abstract
Blood cancer has been a growing concern over the last decade, and early diagnosis is required so that proper treatment can begin. The diagnostic process is costly and time-consuming, involving medical experts and several tests, so an automatic system for accurate prediction is of significant importance. Diagnosing blood cancer from leukemia microarray gene data with machine learning has become an important area of medical research, but despite these efforts, accuracy and efficiency still need improvement. This study proposes a supervised machine learning approach for blood cancer prediction. It uses a leukemia microarray gene dataset containing 22,283 genes. ADASYN resampling and Chi-squared (Chi2) feature selection are applied to address the class imbalance and high dimensionality of the dataset: ADASYN generates synthetic samples to balance the target classes, and Chi2 selects the best of the 22,283 features for training the learning models. For classification, a hybrid logistic vector trees classifier (LVTrees) is proposed, which combines logistic regression, a support vector classifier, and an extra trees classifier. In addition to extensive experiments on the datasets, the proposed approach is compared with state-of-the-art methods to assess its significance. With ADASYN and Chi2, LVTrees outperforms all other models, achieving 100% accuracy. A statistical T-test is also performed to show the efficacy of the proposed approach, and results from k-fold cross-validation further support the superiority of the proposed model.
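The Chi2 feature selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `chi2_scores` and `select_top_k` and the toy data are hypothetical, and the scoring assumes the common formulation for non-negative features, in which the per-class sum of a feature is compared with the sum expected if the feature were independent of the class.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Return one chi-squared score per feature (features must be non-negative)."""
    n_samples = len(X)
    n_features = len(X[0])
    # Observed: per-class sum of each feature's values.
    observed = defaultdict(lambda: [0.0] * n_features)
    class_count = defaultdict(int)
    for row, label in zip(X, y):
        class_count[label] += 1
        sums = observed[label]
        for j, value in enumerate(row):
            sums[j] += value
    # Total of each feature over all samples.
    feature_total = [
        sum(observed[c][j] for c in class_count) for j in range(n_features)
    ]
    scores = []
    for j in range(n_features):
        score = 0.0
        for c, count in class_count.items():
            # Expected sum if the feature were independent of the class.
            expected = feature_total[j] * count / n_samples
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Keep the k highest-scoring feature columns (hypothetical helper)."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy expression matrix: 4 samples, 3 "genes", 2 classes (illustrative only).
X_toy = [[5, 1, 0], [4, 1, 0], [0, 1, 3], [0, 1, 5]]
y_toy = [0, 0, 1, 1]
kept, X_reduced = select_top_k(X_toy, y_toy, k=2)
print(kept)  # indices of the retained features
```

On the toy data, the second feature is identical across classes and receives a score of zero, so it is the one dropped. In the setting the abstract describes, the same ranking would be applied to the 22,283 gene features after resampling; the number of features retained is not stated in the abstract.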