pyPAGE: A framework for Addressing biases in gene-set enrichment analysis-A case study on Alzheimer's disease

Times cited: 0
Authors
Bakulin, Artemy [1]
Teyssier, Noam B. [2]
Kampmann, Martin [2, 3]
Khoroshkin, Matvei [3, 4, 5, 6]
Goodarzi, Hani [3, 4, 5, 6, 7]
Affiliations
[1] Lomonosov Moscow State Univ, Fac Bioengn & Bioinformat, Moscow, Russia
[2] Univ Calif San Francisco, Inst Neurodegenerat Dis, San Francisco, CA USA
[3] Univ Calif San Francisco, Dept Biochem & Biophys, San Francisco, CA 94115 USA
[4] Univ Calif San Francisco, Dept Urol, San Francisco, CA 94115 USA
[5] Univ Calif San Francisco, Helen Diller Family Comprehens Canc Ctr, San Francisco, CA 94115 USA
[6] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA 94115 USA
[7] Arc Inst, Palo Alto, CA 94304 USA
Keywords
MESSENGER-RNA; BINDING-PROTEINS; HNRNP K; CANCER; DIFFERENTIATION; IDENTIFICATION; LOCALIZATION; ACTIVATION; STABILITY; DISCOVERY;
DOI
10.1371/journal.pcbi.1012346
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Inferring the driving regulatory programs from comparative analysis of gene expression data is a cornerstone of systems biology. Many computational frameworks were developed to address this problem, including our iPAGE (information-theoretic Pathway Analysis of Gene Expression) toolset that uses information theory to detect non-random patterns of expression associated with given pathways or regulons. Our recent observations, however, indicate that existing approaches are susceptible to the technical biases that are inherent to most real-world annotations. To address this, we have extended our information-theoretic framework to account for specific biases and artifacts in biological networks using the concept of conditional information. To showcase pyPAGE, we performed a comprehensive analysis of regulatory perturbations that underlie the molecular etiology of Alzheimer's disease (AD). pyPAGE successfully recapitulated several known AD-associated gene expression programs. We also discovered several additional regulons whose differential activity is significantly associated with AD. We further explored how these regulators relate to pathological processes in AD through cell-type-specific analysis of single-cell and spatial gene expression datasets. Our findings showcase the utility of pyPAGE as a precise and reliable tool for biomarker discovery in complex diseases such as Alzheimer's disease.

Biological regulation is governed by a complex network of interactions involving transcription factors, RNA-binding proteins, and microRNAs. To reveal the regulatory programs underlying gene expression modulations, researchers often take advantage of gene-set enrichment analysis, an approach that studies concerted changes in a group of genes rather than observing genes in isolation. Previously, we developed a tool called iPAGE to facilitate this analysis. However, both iPAGE and other similar tools implicitly assume that different genes have uniform gene-set membership. Our recent observations challenge this assumption, revealing that some genes form regulatory networks far more frequently than others. These technical biases and redundancies in gene-set annotations complicate the accurate inference of true regulatory relationships in specific contexts. To overcome these limitations, we introduced pyPAGE, an enhanced method that extends our information-theoretic framework by incorporating conditional mutual information to account for specific biases and artifacts.

We applied pyPAGE to Alzheimer's disease, a complex neurodegenerative disorder characterized by progressive cognitive decline, memory loss, and behavioral changes. The pathological processes of Alzheimer's disease involve various cell types and diverse molecular mechanisms, which pose significant challenges for study. In our study, we used pyPAGE to analyze gene expression changes at both the tissue and cell-type levels. In this way, we were able to describe the regulation patterns of known regulators of Alzheimer's disease as well as several new ones. Additionally, we performed a cohort study of the association between the enrichment of the identified regulons and survival prognosis for patients, showing that increased activity of a subset of RBPs is positively associated with lifespan. Together, the results highlight the utility of pyPAGE and provide a valuable set of biomarkers for Alzheimer's disease.
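The abstract names conditional mutual information as the core idea but does not give an estimator. As a rough illustration only, the sketch below computes I(X; Y | Z) for discrete variables with NumPy, where X could stand for a binned expression change, Y for membership in one gene set, and Z for a binned count of how many gene sets each gene belongs to. The function name, variable roles, and toy data are assumptions made for this example; this is not pyPAGE's API or the authors' implementation.

```python
import numpy as np


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for discrete arrays.

    Hypothetical helper for illustration; not part of pyPAGE.
    """
    x, y, z = (np.asarray(a) for a in (x, y, z))
    cmi = 0.0
    for z_val in np.unique(z):
        mask = z == z_val
        p_z = mask.mean()  # weight of this stratum of Z
        # Empirical joint distribution of (X, Y) inside the stratum.
        x_vals, x_idx = np.unique(x[mask], return_inverse=True)
        y_vals, y_idx = np.unique(y[mask], return_inverse=True)
        joint = np.zeros((x_vals.size, y_vals.size))
        np.add.at(joint, (x_idx, y_idx), 1)
        joint /= joint.sum()
        # Product of marginals, used as the independence baseline.
        indep = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        cmi += p_z * np.sum(joint[nz] * np.log2(joint[nz] / indep[nz]))
    return cmi


# Toy data (assumed): expression bin and gene-set membership agree perfectly,
# but both are fully determined by the membership-count bin.
expression_bin = [0, 0, 1, 1]
in_gene_set = [0, 0, 1, 1]
membership_bin = [0, 0, 1, 1]

# Without conditioning (constant Z) the association looks strong.
unconditional = conditional_mutual_information(
    expression_bin, in_gene_set, [0, 0, 0, 0]
)
# Conditioning on the membership-count bin removes it.
conditional = conditional_mutual_information(
    expression_bin, in_gene_set, membership_bin
)
```

In this toy case the unconditional estimate is 1 bit while the conditional estimate is 0, which is the kind of annotation-driven signal that conditioning is meant to discount.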
Pages: 34
Related papers
50 in total
  • [31] Functional enrichment analysis of three Alzheimer's disease genomewide association studies identities DAB1 as a novel candidate liability/protective gene
    Gao, Hui
    Tao, Yu
    He, Qin
    Song, Fan
    Saffen, David
    BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2015, 463 (04) : 490 - 495
  • [32] Tau gene and Parkinson's disease: a case-control study and meta-analysis
    Healy, DG
    Abou-Sleiman, PM
    Lees, AJ
    Casas, JP
    Quinn, N
    Bhatia, K
    Hingorani, AD
    Wood, NW
    JOURNAL OF NEUROLOGY NEUROSURGERY AND PSYCHIATRY, 2004, 75 (07) : 962 - 965
  • [33] Utility of an Effect Size Analysis for Communicating Treatment Effectiveness: A Case Study of Cholinesterase Inhibitors for Alzheimer's Disease
    Peters, Kevin R.
    JOURNAL OF THE AMERICAN GERIATRICS SOCIETY, 2013, 61 (07) : 1170 - 1174
  • [34] A Histochemical Analysis of Neurofibrillary Tangles in Olfactory Epithelium, a Study Based on an Autopsy Case of Juvenile Alzheimer's Disease
    Shimizu, Shino
    Tojima, Ichiro
    Nakamura, Keigo
    Kouzaki, Hideaki
    Kanesaka, Takeshi
    Ogawa, Norihiro
    Hashizume, Yoshio
    Akatsu, Hiroyasu
    Hori, Akira
    Tooyama, Ikuo
    Shimizu, Takeshi
    ACTA HISTOCHEMICA ET CYTOCHEMICA, 2022, 55 (03) : 93 - 98
  • [35] Gene- gene interaction between PPARG and APOE gene on late-onset Alzheimer’s disease: A case- control study in Chinese han population
    Shuhua Wang
    L. Guan
    D. Luo
    J. Liu
    H. Lin
    X. Li
    X. Liu
    The journal of nutrition, health & aging, 2017, 21 : 397 - 403
  • [36] Gene- gene interaction between PPARG and APOE gene on late-onset Alzheimer's disease: A case- control study in Chinese han population
    Wang, S.
    Guan, L.
    Luo, D.
    Liu, J.
    Lin, H.
    Li, X.
    Liu, X.
    JOURNAL OF NUTRITION HEALTH & AGING, 2017, 21 (04) : 397 - 403
  • [37] Automated Thresholding Method for fNIRS-Based Functional Connectivity Analysis: Validation With a Case Study on Alzheimer's Disease
    Chan, Yee Ling
    Ung, Wei Chun
    Lim, Lam Ghai
    Lu, Cheng-Kai
    Kiguchi, Masashi
    Tang, Tong Boon
    IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING, 2020, 28 (08) : 1691 - 1701
  • [38] PGRN Is Associated with Late-Onset Alzheimer’s Disease: a Case–Control Replication Study and Meta-analysis
    Hui-Min Xu
    Lin Tan
    Yu Wan
    Meng-Shan Tan
    Wei Zhang
    Zhan-Jie Zheng
    Ling-Li Kong
    Zi-Xuan Wang
    Teng Jiang
    Lan Tan
    Jin-Tai Yu
    Molecular Neurobiology, 2017, 54 : 1187 - 1195
  • [39] Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson’s disease behavioral analysis
    Imran Raza
    Muhammad Hasan Jamal
    Rizwan Qureshi
    Abdul Karim Shahid
    Angel Olider Rojas Vistorte
    Md Abdus Samad
    Imran Ashraf
    Scientific Reports, 14
  • [40] Adaptive neighborhood rough set model for hybrid data processing: a case study on Parkinson's disease behavioral analysis
    Raza, Imran
    Jamal, Muhammad Hasan
    Qureshi, Rizwan
    Shahid, Abdul Karim
    Vistorte, Angel Olider Rojas
    Samad, Md Abdus
    Ashraf, Imran
    SCIENTIFIC REPORTS, 2024, 14 (01)