GEMimp: An Accurate and Robust Imputation Method for Microbiome Data Using Graph Embedding Neural Network

Cited by: 0

Authors:

Sun, Ziwei [1]

Song, Kai [1]

Affiliations:

[1] Qingdao Univ, Sch Math & Stat, Qingdao, Peoples R China

Source:

JOURNAL OF MOLECULAR BIOLOGY | 2024 / Vol. 436 / Issue 23

Funding:

National Natural Science Foundation of China;

Keywords:

microbiome; imputation; graph embedding neural network; DIFFERENTIAL ABUNDANCE ANALYSIS; METAGENOME; REGRESSION; MODEL;

DOI:

10.1016/j.jmb.2024.168841

Chinese Library Classification:

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Microbiome research has increasingly underscored the profound link between microbial compositions and human health, with numerous studies establishing a strong correlation between microbiome characteristics and various diseases. However, the analysis of microbiome data is frequently compromised by inherent sparsity issues, characterized by a substantial presence of observed zeros. These zeros not only skew the abundance distribution of microbial species but also undermine the reliability of scientific conclusions drawn from such data. Addressing this challenge, we introduce GEMimp, an innovative imputation method designed to infuse robustness into microbiome data analysis. GEMimp leverages the node2vec algorithm, which incorporates both Breadth-First Search (BFS) and Depth-First Search (DFS) strategies in its random-walk sampling process. This approach enables GEMimp to learn nuanced, low-dimensional representations of each taxonomic unit, facilitating the reconstruction of their similarity networks with unprecedented accuracy. Our comparative analysis pits GEMimp against state-of-the-art imputation methods including SAVER, MAGIC and mbImpute. The results unequivocally demonstrate that GEMimp outperforms its counterparts by achieving the highest Pearson correlation coefficient when compared to the original raw dataset. Furthermore, GEMimp shows notable proficiency in identifying significant taxa, enhancing the detection of disease-related taxa and effectively mitigating the impact of sparsity on both simulated and real-world datasets, such as those pertaining to Type 2 Diabetes (T2D) and Colorectal Cancer (CRC). These findings collectively highlight the strong effectiveness of GEMimp, allowing for better analysis of microbial data. With alleviation of sparsity issues, it could be greatly facilitated in down

© 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
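
The abstract mentions that node2vec mixes BFS-like and DFS-like exploration in its random walks. As a rough illustration only, the sketch below shows one such biased walk on a small unweighted graph, using node2vec's return parameter `p` and in-out parameter `q`. It is not GEMimp's implementation: the toy graph, the taxon names, and the function name `node2vec_walk` are hypothetical, and the abstract does not say how GEMimp builds its graph from abundance data or which parameter values it uses.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk (node2vec style).

    graph  : dict mapping each node to the set of its neighbours
             (undirected, unweighted; an assumption of this sketch).
    p      : return parameter; larger p makes stepping back less likely.
    q      : in-out parameter; q > 1 keeps the walk near the previous
             node (BFS-like), q < 1 pushes it outward (DFS-like).
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbours = sorted(graph[cur])
        if not neighbours:
            break  # dead end: stop early
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbours))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbours:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in graph[prev]:
                weights.append(1.0)       # stays at distance 1 from prev
            else:
                weights.append(1.0 / q)   # moves to distance 2 from prev
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical toy similarity graph between taxa (illustration only).
toy_graph = {
    "taxon_a": {"taxon_b", "taxon_c"},
    "taxon_b": {"taxon_a", "taxon_c", "taxon_d"},
    "taxon_c": {"taxon_a", "taxon_b"},
    "taxon_d": {"taxon_b", "taxon_e"},
    "taxon_e": {"taxon_d"},
}

example_walk = node2vec_walk(toy_graph, "taxon_a", length=10,
                             p=0.5, q=2.0, rng=random.Random(0))
print(example_walk)
```

In the full node2vec procedure, many such walks per node are fed to a skip-gram model to obtain the low-dimensional embeddings; that step is omitted here.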

Pages: 12

50 records in total

[1] mbImpute: an accurate and robust imputation method for microbiome data
Jiang, Ruochen
Li, Wei Vivian
Li, Jingyi Jessica
GENOME BIOLOGY, 2021, 22 (01)
[2] mbImpute: an accurate and robust imputation method for microbiome data
Ruochen Jiang
Wei Vivian Li
Jingyi Jessica Li
Genome Biology, 22
[3] Graph-Tensor Neural Networks for Network Traffic Data Imputation
Deng, Lei
Liu, Xiao-Yang
Zheng, Haifeng
Feng, Xinxin
Chen, Zhizhang
IEEE-ACM TRANSACTIONS ON NETWORKING, 2023, 31 (06) : 3010 - 3024
[4] GCRINT: Network Traffic Imputation Using Graph Convolutional Recurrent Neural Network
Van An Le
Tien Thanh Le
Phi Le Nguyen
Huynh Thi Thanh Binh
Akerkar, Rajendra
Ji, Yusheng
IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2021), 2021,
[5] Embedding Imputation With Self-Supervised Graph Neural Networks
Varolgunes, Uras
Yao, Shibo
Ma, Yao
Yu, Dantong
IEEE ACCESS, 2023, 11 : 70610 - 70620
[6] Biomedical Network Link Prediction using Neural Network Graph Embedding
Kumar, Sumit
Pranesh, Raj Ratn
Shekhar, Ambesh
CODS-COMAD 2021: PROCEEDINGS OF THE 3RD ACM INDIA JOINT INTERNATIONAL CONFERENCE ON DATA SCIENCE & MANAGEMENT OF DATA (8TH ACM IKDD CODS & 26TH COMAD), 2021, : 412 - 412
[7] Toward Early and Accurate Network Intrusion Detection Using Graph Embedding
Hu, Xiaoyan
Gao, Wenjie
Cheng, Guang
Li, Ruidong
Zhou, Yuyang
Wu, Hua
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 5817 - 5831
[8] Influence maximization in social networks using graph embedding and graph neural network
Kumar, Sanjay
Mallik, Abhishek
Khetarpal, Anavi
Panda, B. S.
INFORMATION SCIENCES, 2022, 607 : 1617 - 1636
[9] Missing Pavement Performance Data Imputation Using Graph Neural Networks
Gao, Lu
Yu, Ke
Lu, Pan
TRANSPORTATION RESEARCH RECORD, 2022, 2676 (12) : 409 - 419
[10] Graph Neural Network contextual embedding for Deep Learning on tabular data
Villaizan-Vallelado, Mario
Salvatori, Matteo
Carro, Belen
Sanchez-Esguevillas, Antonio Javier
NEURAL NETWORKS, 2024, 173