Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data

Cited by: 50
Authors
Wang, JB [1]
Bo, TH
Jonassen, I
Myklebost, O
Hovig, E
Affiliations
[1] Norwegian Radium Hosp, Dept Tumor Biol, N-0310 Oslo, Norway
[2] Univ Bergen, HIB, Dept Informat, N-5020 Bergen, Norway
[3] Univ Bergen, Bergen Ctr Computat Sci, Computat Biol Unit, N-5020 Bergen, Norway
[4] Univ Oslo, Dept Mol Biosci, N-0316 Oslo, Norway
DOI
10.1186/1471-2105-4-60
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Using DNA microarrays, we have developed two novel models for tumor classification and target gene prediction. First, gene expression profiles are summarized by optimally selected Self-Organizing Maps (SOMs), and the tumor samples are then classified by Fuzzy C-means clustering. Marker genes are then predicted by either manual feature selection (visualizing the weighted/mean SOM component plane) or automatic feature selection (by pair-wise Fisher's linear discriminant).
Results: The proposed models were tested on four published datasets: (1) leukemia, (2) colon cancer, (3) brain tumors, and (4) NCI cancer cell lines. The models predicted classes with markedly lower error rates than other class prediction approaches, and the results also emphasize the importance of feature selection in microarray data analysis.
Conclusions: Our models identify marker genes with predictive potential, often better than other available methods in the literature. The models are potentially useful for medical diagnostics and may reveal some insights into cancer classification. Additionally, we illustrate two limitations in tumor classification from microarray data that are related to the biology underlying the data: (1) the class size of the data, and (2) the internal structure of the classes. These limitations are not specific to the classification models used.
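As an illustration of the two techniques the abstract names, the following is a minimal sketch in plain Python, not the authors' implementation. It shows the standard fuzzy c-means update rules and a simplified per-gene Fisher criterion. The function names, the fuzzifier `m=2.0`, the toy data, and the use of a univariate score in place of the paper's pair-wise Fisher's linear discriminant are all assumptions made for this example; the SOM summarization step is omitted.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length float lists.

    Returns (centers, memberships). memberships[i][k] is the degree to which
    sample i belongs to cluster k; each row sums to 1.
    The defaults (m, n_iter, tol, seed) are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Cluster centers: means of the samples weighted by u ** m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Memberships: u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)).
        new_u = []
        for x in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
            total = sum(inv)
            new_u.append([v / total for v in inv])

        # Stop once no membership changes by more than tol.
        shift = max(
            abs(new_u[i][k] - u[i][k])
            for i in range(n) for k in range(n_clusters)
        )
        u = new_u
        if shift < tol:
            break
    return centers, u


def fisher_score(values_a, values_b):
    """Univariate Fisher criterion for one gene measured in two classes:
    squared difference of class means over the sum of class variances."""
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    var_a = sum((v - mean_a) ** 2 for v in values_a) / len(values_a)
    var_b = sum((v - mean_b) ** 2 for v in values_b) / len(values_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b + 1e-12)


# Toy data (made up for this sketch): six samples, two features each.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(samples, n_clusters=2)

# Hard labels: the cluster with the highest membership for each sample.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
print(labels)
```

Unlike hard k-means, each sample keeps a graded membership in every cluster, so ambiguous samples can be identified by memberships that are close to each other. A gene whose `fisher_score` between two classes is high separates those classes well and is a candidate marker under this simplified criterion.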
Pages: 12
Related papers
50 in total
  • [41] Fuzzy Clustering Using C-Means Method
    Krastev, Georgi
    Georgiev, Tsvetozar
    TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2015, 4 (02): : 144 - 148
  • [42] A novel data selection technique using fuzzy C-means clustering to enhance SVM-based power quality classification
    K. Manimala
    Indra Getzy David
    K. Selvi
    Soft Computing, 2015, 19 : 3123 - 3144
  • [43] A novel data selection technique using fuzzy C-means clustering to enhance SVM-based power quality classification
    Manimala, K.
    David, Indra Getzy
    Selvi, K.
    SOFT COMPUTING, 2015, 19 (11) : 3123 - 3144
  • [44] A weighted fuzzy c-means clustering model for fuzzy data
    D'Urso, P
    Giordani, P
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2006, 50 (06) : 1496 - 1523
  • [45] UFFDFR: Undersampling framework with denoising, fuzzy c-means clustering, and representative sample selection for imbalanced data classification
    Zheng, Ming
    Li, Tong
    Zheng, Xiaoyao
    Yu, Qingying
    Chen, Chuanming
    Zhou, Ding
    Lv, Changlong
    Yang, Weiyi
    INFORMATION SCIENCES, 2021, 576 (576) : 658 - 680
  • [46] Fuzzy c-means clustering for data with tolerance using kernel functions
    Kanzawa, Yuchi
    Endo, Yasunori
    Miyamoto, Sadaaki
    2006 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-5, 2006, : 744 - +
  • [47] A fuzzy C-means algorithm for optimizing data clustering
    Hashemi, Seyed Emadedin
    Gholian-Jouybari, Fatemeh
    Hajiaghaei-Keshteli, Mostafa
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 227
  • [48] Clustering Spatiotemporal Data: An Augmented Fuzzy C-Means
    Izakian, Hesam
    Pedrycz, Witold
    Jamal, Iqbal
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2013, 21 (05) : 855 - 868
  • [49] Pattern Classification of Typhoon Tracks Using the Fuzzy c-Means Clustering Method
    Kim, Hyeong-Seog
    Kim, Joo-Hong
    Ho, Chang-Hoi
    Chu, Pao-Shin
    JOURNAL OF CLIMATE, 2011, 24 (02) : 488 - 508
  • [50] Median fuzzy c-means for clustering dissimilarity data
    Geweniger, Tina
    Zuelke, Dietlind
    Hammer, Barbara
    Villmann, Thomas
    NEUROCOMPUTING, 2010, 73 (7-9) : 1109 - 1116