pyPAGE: A framework for Addressing biases in gene-set enrichment analysis-A case study on Alzheimer's disease

Cited by: 0
Authors
Bakulin, Artemy [1]
Teyssier, Noam B. [2]
Kampmann, Martin [2, 3]
Khoroshkin, Matvei [3, 4, 5, 6]
Goodarzi, Hani [3, 4, 5, 6, 7]
Affiliations
[1] Lomonosov Moscow State Univ, Fac Bioengn & Bioinformat, Moscow, Russia
[2] Univ Calif San Francisco, Inst Neurodegenerat Dis, San Francisco, CA USA
[3] Univ Calif San Francisco, Dept Biochem & Biophys, San Francisco, CA 94115 USA
[4] Univ Calif San Francisco, Dept Urol, San Francisco, CA 94115 USA
[5] Univ Calif San Francisco, Helen Diller Family Comprehens Canc Ctr, San Francisco, CA 94115 USA
[6] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA 94115 USA
[7] Arc Inst, Palo Alto, CA 94304 USA
Keywords
MESSENGER-RNA; BINDING-PROTEINS; HNRNP K; CANCER; DIFFERENTIATION; IDENTIFICATION; LOCALIZATION; ACTIVATION; STABILITY; DISCOVERY;
DOI
10.1371/journal.pcbi.1012346
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Inferring the driving regulatory programs from comparative analysis of gene expression data is a cornerstone of systems biology. Many computational frameworks have been developed to address this problem, including our iPAGE (information-theoretic Pathway Analysis of Gene Expression) toolset, which uses information theory to detect non-random patterns of expression associated with given pathways or regulons. Our recent observations, however, indicate that existing approaches are susceptible to the technical biases inherent to most real-world annotations. To address this, we have extended our information-theoretic framework to account for specific biases and artifacts in biological networks using the concept of conditional information. To showcase pyPAGE, we performed a comprehensive analysis of regulatory perturbations that underlie the molecular etiology of Alzheimer's disease (AD). pyPAGE successfully recapitulated several known AD-associated gene expression programs. We also discovered several additional regulons whose differential activity is significantly associated with AD. We further explored how these regulators relate to pathological processes in AD through cell-type-specific analysis of single-cell and spatial gene expression datasets. Our findings showcase the utility of pyPAGE as a precise and reliable tool for biomarker discovery in complex diseases such as Alzheimer's disease.
Biological regulation is governed by a complex network of interactions involving transcription factors, RNA-binding proteins, and microRNAs. To reveal the regulatory programs underlying gene expression modulations, researchers often use gene-set enrichment analysis, an approach that studies concerted changes in a group of genes rather than observing genes in isolation. Previously, we developed a tool called iPAGE to facilitate this analysis. However, iPAGE and similar tools implicitly assume that all genes are equally likely to belong to gene sets. Our recent observations challenge this assumption, revealing that some genes appear in regulatory networks far more frequently than others. These technical biases and redundancies in gene-set annotations complicate the accurate inference of true regulatory relationships in specific contexts. To overcome these limitations, we introduced pyPAGE, an enhanced method that extends our information-theoretic framework by incorporating conditional mutual information to account for specific biases and artifacts.
We applied pyPAGE to Alzheimer's disease, a complex neurodegenerative disorder characterized by progressive cognitive decline, memory loss, and behavioral changes. The pathological processes of Alzheimer's disease involve various cell types and diverse molecular mechanisms, which makes them challenging to study. We used pyPAGE to analyze gene expression changes at both the tissue and cell-type levels. This allowed us to describe the regulation patterns of known regulators of Alzheimer's disease as well as several new ones. Additionally, we performed a cohort study of the association between the enrichment of the identified regulons and patient survival, showing that increased activity of a subset of RBPs is positively associated with lifespan. Together, the results highlight the utility of pyPAGE and provide a valuable set of biomarkers for Alzheimer's disease.
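The idea of conditional mutual information mentioned in the abstract can be illustrated with a minimal sketch. This is not the pyPAGE API: the function names below are hypothetical, and the choice of conditioning variable (here `z`, a binned count of how many gene sets each gene belongs to) is an assumption made for illustration. Given discretized expression bins `x` and binary gene-set membership `y`, the sketch computes I(X;Y|Z) as the average of the within-stratum mutual information, weighted by stratum size.

```python
import numpy as np

def mutual_information(x, y):
    """I(X;Y) in bits for two integer-coded discrete arrays (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Build the joint probability table from co-occurrence counts.
    joint = np.zeros((x.max() + 1, y.max() + 1))
    np.add.at(joint, (x, y), 1)
    joint /= len(x)
    px = joint.sum(axis=1, keepdims=True)  # marginal of X, column vector
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y, row vector
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = sum over z of p(z) * I(X;Y | Z=z), in bits."""
    x, y, z = map(np.asarray, (x, y, z))
    total = 0.0
    for value in np.unique(z):
        mask = z == value
        total += mask.mean() * mutual_information(x[mask], y[mask])
    return total

# Toy data: expression bin and membership both simply track the bias variable z,
# so the apparent association disappears once z is conditioned on.
z = np.array([0, 0, 1, 1])
x_biased = z.copy()
y_biased = z.copy()
mi_biased = mutual_information(x_biased, y_biased)                   # 1.0 bit
cmi_biased = conditional_mutual_information(x_biased, y_biased, z)   # 0.0 bits

# Toy data: the association holds within each stratum of z, so it survives.
x_true = np.array([0, 1, 0, 1])
y_true = np.array([0, 1, 0, 1])
cmi_true = conditional_mutual_information(x_true, y_true, z)         # 1.0 bit
```

In the first toy case the unconditioned score would flag the gene set as informative even though the signal is entirely explained by the bias variable; conditioning removes it while leaving the second, genuine association intact.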
Pages: 34