Identification and Analysis of Cancer Diagnosis Using Probabilistic Classification Vector Machines with Feature Selection

Cited by: 21
Authors
Du, Xiuquan [1, 2]
Li, Xinrui [2]
Li, Wen [2]
Yan, Yuanting [1, 2]
Zhang, Yanping [1, 2]
Affiliations
[1] Anhui Univ, Key Lab Intelligent Comp & Signal Proc, Minist Educ, Hefei, Anhui, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
Funding
National Science Foundation (United States);
Keywords
Probabilistic classification vector; feature selection; tumor classification; DX; machine learning; kernel function; GENE; PREDICTION;
DOI
10.2174/1574893612666170405125637
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Accurate classification of tumor types is very important for the treatment of cancer. With advances in microarray expression profiling, many methods have been proposed to handle these data. However, because the feature dimension of tumor gene expression profiles is very high, many machine learning algorithms fail. Objective & Methods: In this paper, a novel method, probabilistic classification vector machines (PCVM) with feature selection, is proposed for tumor type detection using gene expression data. PCVM adopts a signed and truncated Gaussian prior to address the unstable solutions that can arise when the same prior is used for different classes, and the truncated Gaussian prior also controls the complexity of the model. The performance of PCVM is evaluated on two datasets using four metrics. Results: This method achieves 84.21% accuracy on the leukemia dataset and 95.24% accuracy on the prostate dataset. Compared with other methods, PCVM obtains much higher performance than Support Vector Machines (SVM), Naive Bayes (NB), RBF Neural Networks (RBF), K-nearest Neighbor (KNN), and Random Forest (RF), except for SVM on the prostate dataset. To reduce computational time, we adopt a feature selection method (DX) to rank the features and search for the optimal feature combination based on PCVM. PCVM with the DX method (PCVM-DX) achieves 94.74% accuracy, 100% sensitivity, 85.71% specificity and 92.31% precision on the leukemia dataset, and obtains the same result as PCVM on the prostate dataset. We also compare DX with other feature selection methods; the results show that PCVM-DX is efficient for tumor classification in terms of performance. Conclusion: PCVM-DX is observed to perform better than the other methods on the two datasets.
The novelty of this approach lies in applying PCVM to tackle the problem that using the same prior for different classes may lead to unstable solutions in relevance vector machines (RVMs), and in exploring the important feature subset in the microarray expression profile with feature selection.
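The abstract reports four metrics (accuracy, sensitivity, specificity, precision) without stating how they are computed. The sketch below uses the standard confusion-matrix definitions, which is an assumption about what the authors used. The counts in `hypothetical` are not taken from the paper: they are illustrative values chosen only because they happen to reproduce the percentages reported for PCVM-DX on the leukemia dataset, and the names `ConfusionCounts` and `classification_metrics` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper type)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the four metrics named in the abstract from raw counts."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate (recall)
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "precision": c.tp / (c.tp + c.fp),    # positive predictive value
    }


# Assumed, illustrative counts only; the paper's confusion matrix is not
# given in the abstract.
hypothetical = ConfusionCounts(tp=12, fn=0, tn=6, fp=1)
metrics = classification_metrics(hypothetical)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts the four values round to 94.74%, 100.00%, 85.71% and 92.31%, which shows that the reported figures are mutually consistent for a 19-sample test set, though other explanations are possible.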
Pages: 625-632
Page count: 8
Related papers
50 in total
  • [31] Feature Selection Using Probabilistic Prediction of Support Vector Regression
    Yang, Jian-Bo
    Ong, Chong-Jin
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2011, 22 (06): 954-962
  • [32] Sensitivity of Support Vector Machines to Random Feature Selection in Classification of Hyperspectral Data
    Waske, Bjoern
    van der Linden, Sebastian
    Benediktsson, Jon Atli
    Rabe, Andreas
    Hostert, Patrick
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2010, 48 (07): 2880-2889
  • [33] Multi-View Scaling Support Vector Machines for Classification and Feature Selection
    Xu, Jinglin
    Han, Junwei
    Nie, Feiping
    Li, Xuelong
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2020, 32 (07): 1419-1430
  • [34] Geographical Classification of Tannat Wines Based on Support Vector Machines and Feature Selection
    Costa, Nattane Luiza
    Garcia Llobodanin, Laura Andrea
    Castro, Inar Alves
    Barbosa, Rommel
    BEVERAGES, 2018, 4 (04)
  • [35] Gene selection for cancer classification using bootstrapped genetic algorithms and support vector machines
    Chen, XW
    PROCEEDINGS OF THE 2003 IEEE BIOINFORMATICS CONFERENCE, 2003: 504-505
  • [36] Gene selection and prediction for cancer classification using support vector machines with a reject option
    Choi, Hosik
    Yeo, Donghwa
    Kwon, Sunghoon
    Kim, Yongdai
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (05): 1897-1908
  • [37] Investigating the effect of Correlation based Feature Selection on breast cancer diagnosis using Artificial Neural Network and Support Vector Machines
    Alyami, Reem
    Alhajjaj, Jinan
    Alnajrani, Batool
    Elaalami, Ilham
    Alqahtani, Abdullah
    Aldhafferi, Nahier
    Owolabi, Taoreed O.
    Olatunji, Sunday O.
    2017 INTERNATIONAL CONFERENCE ON INFORMATICS, HEALTH & TECHNOLOGY (ICIHT), 2017
  • [38] Multiclass Probabilistic Classification for Support Vector Machines
    Bae, Ji-Sang
    Kim, Jong-Ok
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (06): 1251-1255
  • [39] Diagnosis by Support Vector Machines combined with feature selection based on mutual information
    Sun, Z.
    Xi, G.
    Yi, J.
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2006, 13E: 736-741
  • [40] Kernel Fisher Discriminant Analysis Using Feature Vector Selection for Fault Diagnosis
    Wu, Hongyan
    Huang, Daoping
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL III, PROCEEDINGS, 2008: 109-113