A new measure for gene expression biclustering based on non-parametric correlation

Cited by: 34
Authors
Flores, Jose L. [1 ]
Inza, Inaki [1 ]
Larranaga, Pedro [2 ]
Calvo, Borja [1 ]
Affiliations
[1] Univ Basque Country, Dept Comp Sci & Artificial Intelligence, Intelligent Syst Grp, Donostia San Sebastian 20080, Spain
[2] Tech Univ Madrid, Dept Artificial Intelligence, Computat Intelligence Grp, Madrid 28660, Spain
Keywords
Biclustering; Biomedicine; Artificial intelligence; Machine learning; MICROARRAY DATA; CLUSTERING-ALGORITHM; SELECTION; STABILITY;
DOI
10.1016/j.cmpb.2013.07.025
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Background: Biclustering, one of the emerging techniques for analyzing DNA microarray data, searches for subsets of genes and conditions that are coherently expressed. These subgroups provide clues about the main biological processes. Different approaches to this problem have been proposed. Most of them use the mean squared residue as the quality measure, but it cannot detect relevant and interesting patterns such as shifting or scaling patterns. Furthermore, recent papers show that new coherence patterns, such as inverse relationships between genes, are involved in different kinds of cancers and tumors, and these cannot be captured either. Results: The proposed measure, called Spearman's biclustering measure (SBM), estimates the quality of a bicluster based on the non-linear correlation among genes and conditions simultaneously. The search for biclusters is performed with an evolutionary technique, estimation of distribution algorithms, which uses the SBM as the fitness function. This approach has been examined from different points of view using artificial and real microarrays. The assessment involved quality indexes, a set of reference bicluster patterns that includes new patterns, and a set of statistical tests. Performance has also been examined on real microarrays and compared with different algorithmic approaches such as Bimax, CC, OPSM, Plaid and xMotifs. Conclusions: SBM shows several advantages, such as the ability to recognize more complex coherence patterns (shifting, scaling and inversion) and the capability to selectively marginalize genes and conditions depending on their statistical significance. (C) 2013 Elsevier Ireland Ltd. All rights reserved.
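As a rough illustration of why a rank-based correlation can pick up the shifting, scaling and inversion patterns named in the abstract, the sketch below scores a toy bicluster by the mean absolute Spearman correlation between its rows. This is not the SBM formula from the paper, which the abstract does not give; the function names (`ranks`, `spearman`, `mean_abs_spearman`) and the toy data are assumptions made for illustration only.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; constant vectors are not handled in this sketch."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def mean_abs_spearman(rows):
    """Illustrative score (an assumption, not SBM): mean |rho| over row pairs."""
    pairs = list(combinations(rows, 2))
    return sum(abs(spearman(a, b)) for a, b in pairs) / len(pairs)


# Toy bicluster: one base expression profile plus three transformed copies.
base = [1.0, 4.0, 2.0, 8.0, 5.0]
bicluster = [
    base,
    [v + 3.0 for v in base],  # shifting pattern
    [v * 2.5 for v in base],  # scaling pattern
    [-v for v in base],       # inversion pattern
]

score = mean_abs_spearman(bicluster)
inverse_rho = spearman(base, [-v for v in base])
print(round(score, 6), round(inverse_rho, 6))
```

Because shifting and positive scaling leave the ranks unchanged and inversion reverses them, every pair of rows in the toy bicluster is perfectly rank-correlated in absolute value, so the score reaches 1 up to floating-point rounding.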
Pages: 367 - 397
Page count: 31
Related papers
50 in total
  • [1] A NON-PARAMETRIC BAYESIAN CLUSTERING FOR GENE EXPRESSION DATA
    Wang, Liming
    Wang, Xiaodong
    2012 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2012, : 556 - 559
  • [2] A NON-PARAMETRIC MEASURE OF POVERTY ELASTICITY
    Chambers, Dustin
    Dhongde, Shatakshee
    REVIEW OF INCOME AND WEALTH, 2011, 57 (04) : 683 - 703
  • [3] A Copula-Based Non-parametric Measure of Regression Dependence
    Dette, Holger
    Siburg, Karl F.
    Stoimenov, Pavel A.
    SCANDINAVIAN JOURNAL OF STATISTICS, 2013, 40 (01) : 21 - 41
  • [4] Biclustering of Gene Expression Data Based on SimUI Semantic Similarity Measure
    Nepomuceno, Juan A.
    Troncoso, Alicia
    Nepomuceno-Chamorro, Isabel A.
    Aguilar-Ruiz, Jesus S.
    Hybrid Artificial Intelligent Systems, 2016, 9648 : 685 - 693
  • [5] Biclustering of Gene Expression Data by Correlation-Based Scatter Search
    Nepomuceno, Juan A.
    Troncoso, Alicia
    Aguilar-Ruiz, Jesus S.
    BIODATA MINING, 2011, 4
  • [6] Biclustering of Gene Expression Data by Correlation-Based Scatter Search
    Juan A Nepomuceno
    Alicia Troncoso
    Jesús S Aguilar-Ruiz
    BioData Mining, 4
  • [7] Jonckheere-Terpstra-Kendall-based non-parametric analysis of temporal differential gene expression
    Iuchi, Hitoshi
    Hamada, Michiaki
    NAR GENOMICS AND BIOINFORMATICS, 2021, 3 (01)
  • [8] On non-parametric measures of correlation for directional data
    Alvo, M
    ENVIRONMETRICS, 1998, 9 (06) : 645 - 656
  • [10] Non-parametric confidence intervals for covariance and correlation
    Withers, C. S.
    Nadarajah, S.
    METRON, 2014, 72 (3) : 283 - 306