A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability

Cited by: 16
Authors
Sontrop, Herman M. J. [1]
Moerland, Perry D. [2]
van den Ham, Rene [3]
Reinders, Marcel J. T. [4]
Verhaegh, Wim F. J. [1]
Affiliations
[1] Philips Res Labs, Mol Diagnost Dept, NL-5656 AE Eindhoven, Netherlands
[2] Acad Med Ctr, Dept Clin Epidemiol Biostat & Bioinformat, Bioinformat Lab, NL-1100 AZ Amsterdam, Netherlands
[3] Philips Res Labs, Dept Biomol Engn, NL-5656 AE Eindhoven, Netherlands
[4] Delft Univ Technol, Delft Bioinformat Lab, NL-2628 CD Delft, Netherlands
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
PROPAGATING UNCERTAINTY; PROBABILISTIC MODEL; EXPRESSION; LEVEL; PREDICTION; SIGNATURE; NORMALIZATION; METASTASIS; GENES; NOISE;
DOI
10.1186/1471-2105-10-389
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear.
Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state-of-the-art preprocessing methods, using a compendium consisting of eight different datasets, involving 1131 hybridizations, containing data from both one- and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical.
Conclusion: Feature variability can have a strong impact on breast cancer signature composition, as well as the classification of individual patient samples. We therefore strongly recommend that feature variability is considered in analyzing data from microarray breast cancer expression profiling experiments.
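Illustration (not part of the original record): the stability analysis described in the abstract, perturbing expression profiles according to per-feature uncertainty and checking whether class assignments change, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the nearest-centroid classifier, the Gaussian noise model, the synthetic data, and all function and variable names are hypothetical choices made for illustration.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Fit a simple nearest-centroid classifier (assumed stand-in classifier)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(model, X):
    """Assign each sample to the class with the closest centroid."""
    classes, centroids = model
    # Squared Euclidean distance from every sample to every centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def assignment_stability(model, X, sigma, n_draws=100, seed=0):
    """Fraction of noisy draws in which each sample keeps its unperturbed label.

    sigma holds an assumed standard error per sample and feature; the
    perturbation model (independent Gaussian noise) is an assumption.
    """
    rng = np.random.default_rng(seed)
    baseline = nearest_centroid_predict(model, X)
    agree = np.zeros(X.shape[0])
    for _ in range(n_draws):
        X_pert = X + rng.normal(0.0, 1.0, size=X.shape) * sigma
        agree += nearest_centroid_predict(model, X_pert) == baseline
    return agree / n_draws

# Synthetic two-class data standing in for expression profiles (hypothetical).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0, 1, (20, 5)), data_rng.normal(2, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
sigma = np.full(X.shape, 0.5)  # assumed uniform measurement uncertainty

model = nearest_centroid_fit(X, y)
stability = assignment_stability(model, X, sigma)
stability_no_noise = assignment_stability(model, X, np.zeros(X.shape))
```

Samples whose stability is well below 1 are those whose class assignment is sensitive to measurement error under these assumptions; with zero uncertainty every sample trivially keeps its label.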
Pages: 22