pyPAGE: A framework for Addressing biases in gene-set enrichment analysis-A case study on Alzheimer's disease

Authors
Bakulin, Artemy [1]
Teyssier, Noam B. [2]
Kampmann, Martin [2, 3]
Khoroshkin, Matvei [3, 4, 5, 6]
Goodarzi, Hani [3, 4, 5, 6, 7]
Affiliations
[1] Lomonosov Moscow State Univ, Fac Bioengn & Bioinformat, Moscow, Russia
[2] Univ Calif San Francisco, Inst Neurodegenerat Dis, San Francisco, CA USA
[3] Univ Calif San Francisco, Dept Biochem & Biophys, San Francisco, CA 94115 USA
[4] Univ Calif San Francisco, Dept Urol, San Francisco, CA 94115 USA
[5] Univ Calif San Francisco, Helen Diller Family Comprehens Canc Ctr, San Francisco, CA 94115 USA
[6] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA 94115 USA
[7] Arc Inst, Palo Alto, CA 94304 USA
Keywords
MESSENGER-RNA; BINDING-PROTEINS; HNRNP K; CANCER; DIFFERENTIATION; IDENTIFICATION; LOCALIZATION; ACTIVATION; STABILITY; DISCOVERY;
DOI
10.1371/journal.pcbi.1012346
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Inferring the driving regulatory programs from comparative analysis of gene expression data is a cornerstone of systems biology. Many computational frameworks were developed to address this problem, including our iPAGE (information-theoretic Pathway Analysis of Gene Expression) toolset that uses information theory to detect non-random patterns of expression associated with given pathways or regulons. Our recent observations, however, indicate that existing approaches are susceptible to the technical biases that are inherent to most real-world annotations. To address this, we have extended our information-theoretic framework to account for specific biases and artifacts in biological networks using the concept of conditional information. To showcase pyPAGE, we performed a comprehensive analysis of regulatory perturbations that underlie the molecular etiology of Alzheimer's disease (AD). pyPAGE successfully recapitulated several known AD-associated gene expression programs. We also discovered several additional regulons whose differential activity is significantly associated with AD. We further explored how these regulators relate to pathological processes in AD through cell-type-specific analysis of single-cell and spatial gene expression datasets. Our findings showcase the utility of pyPAGE as a precise and reliable tool for biomarker discovery in complex diseases such as Alzheimer's disease.
Biological regulation is governed by a complex network of interactions involving transcription factors, RNA-binding proteins, and microRNAs. To reveal the regulatory programs underlying gene expression modulations, researchers often take advantage of gene-set enrichment analysis, an approach that studies concerted changes in a group of genes rather than observing genes in isolation. Previously, we developed a tool called iPAGE to facilitate this analysis. However, both iPAGE and other similar tools implicitly assume that different genes have uniform gene-set membership. Our recent observations challenge this assumption, revealing that some genes appear in annotated regulatory networks far more frequently than others. These technical biases and redundancies in gene-set annotations complicate the accurate inference of true regulatory relationships in specific contexts. To overcome these limitations, we introduced pyPAGE, an enhanced method that extends our information-theoretic framework by incorporating conditional mutual information to account for specific biases and artifacts.
We applied pyPAGE to Alzheimer's disease, a complex neurodegenerative disorder characterized by progressive cognitive decline, memory loss, and behavioral changes. The pathological processes of Alzheimer's disease involve various cell types and diverse molecular mechanisms, which pose significant challenges for study. In our study, we used pyPAGE to analyze gene expression changes at both the tissue and cell-type levels. This allowed us to describe the regulation patterns of known regulators of Alzheimer's disease as well as several new ones. Additionally, we performed a cohort study of the association between the enrichment of the identified regulons and survival prognosis for patients, showing that increased activity of a subset of RBPs is positively associated with lifespan. Together, the results highlight the utility of pyPAGE and provide a valuable set of biomarkers for Alzheimer's disease.
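The abstract's central idea, conditional mutual information, can be illustrated with a minimal sketch. This is not pyPAGE's API: the function name `conditional_mutual_information` and the choice of variables are assumptions made for illustration. It takes X as a gene's discretized expression bin, Y as its membership in a gene set, and Z as a binned measure of how often the gene is annotated, and computes the plug-in estimate I(X;Y|Z) = Σ p(x,y,z) log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ] from raw counts.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for equal-length discrete sequences.

    Hypothetical helper for illustration only; not part of pyPAGE.
    """
    n = len(x)
    if n == 0 or not (n == len(y) == len(z)):
        raise ValueError("x, y and z must be non-empty and of equal length")

    # Joint and marginal counts over the observed label combinations.
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(z) p(x,y,z) / (p(x,z) p(y,z)) simplifies to a ratio of counts.
        ratio = c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (c / n) * log2(ratio)
    return cmi


# Assumed toy data: expression bin (x), gene-set membership (y),
# annotation-frequency bin (z).
# Case 1: x and y agree perfectly and z is constant, so z explains nothing.
unbiased = conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0])
# Case 2: x and y agree only because both track z, so conditioning removes it.
confounded = conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1])
print(unbiased, confounded)
```

In the second case the unconditioned association between expression and membership would look perfect, yet it vanishes once the annotation-frequency bin is conditioned on, which is the kind of bias the abstract describes.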
Pages: 34