IPEV: identification of prokaryotic and eukaryotic virus-derived sequences in virome using deep learning

Authors
Yin, Hengchuang [1, 2]
Wu, Shufang [1, 2]
Tan, Jie [1, 2]
Guo, Qian [1, 2]
Li, Mo [1, 2, 3]
Guo, Jinyuan [1, 2, 4]
Wang, Yaqi [1, 2]
Jiang, Xiaoqing [1, 2, 5]
Zhu, Huaiqiu [1, 2, 3, 4]
Affiliations
[1] Peking Univ, Coll Future Technol, Dept Biomed Engn, Beijing 100871, Peoples R China
[2] Peking Univ, Ctr Quantitat Biol, Beijing 100871, Peoples R China
[3] Peking Univ, Sch Life Sci, Beijing 100871, Peoples R China
[4] Georgia Inst Technol, Dept Biomed Engn, Atlanta, GA 30332 USA
[5] Emory Univ, Atlanta, GA 30332 USA
Source
GIGASCIENCE | 2024 / Volume 13
Funding
National Natural Science Foundation of China;
Keywords
HUMAN GUT VIROME; INFECTION; BACTERIA;
DOI
10.1093/gigascience/giae018
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Background: The virome obtained through virus-like particle enrichment contains a mixture of prokaryotic and eukaryotic virus-derived fragments. Accurate identification and classification of these elements are crucial to understanding their roles and functions in microbial communities. However, the rapid mutation rates of viral genomes pose challenges in developing high-performance tools for classification, potentially limiting downstream analyses.
Findings: We present IPEV, a novel method to distinguish prokaryotic and eukaryotic viruses in viromes, with a 2-dimensional convolutional neural network combining trinucleotide pair relative distance and frequency. Cross-validation assessments of IPEV demonstrate its state-of-the-art precision: compared to existing methods, it improves the F1-score by approximately 22% on an independent test set when query viruses share less than 30% sequence similarity with known viruses. Furthermore, IPEV outperforms other methods in accuracy on marine and gut virome samples based on annotations by sequence alignments. IPEV reduces runtime by up to 1,225 times compared to existing methods under the same computing configuration. We also utilized IPEV to analyze longitudinal samples and found that the gut virome exhibits a higher degree of temporal stability than previously observed in persistent personal viromes, providing novel insights into the resilience of the gut virome in individuals.
Conclusions: IPEV is a high-performance, user-friendly tool that assists biologists in identifying and classifying prokaryotic and eukaryotic viruses within viromes. The tool is available at https://github.com/basehc/IPEV.
Pages: 12
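
The abstract states that IPEV feeds a 2-dimensional convolutional neural network with features combining trinucleotide pair relative distance and frequency, but this record does not give the exact encoding. The following is only a minimal sketch of one plausible reading, not the authors' implementation: for each ordered pair of trinucleotides it builds a 64 x 64 co-occurrence frequency matrix and a 64 x 64 mean-distance matrix. The function name `trimer_pair_features`, the `max_gap` window, and the normalization are all assumptions made for illustration.

```python
from itertools import product

# All 64 trinucleotides in lexicographic order, mapped to a matrix index.
TRIMERS = ["".join(p) for p in product("ACGT", repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIMERS)}


def trimer_pair_features(seq, max_gap=10):
    """Hypothetical encoding: return (freq, mean_dist), two 64x64 matrices.

    freq[a][b]      -- share of all counted pairs in which trinucleotide a is
                       followed by trinucleotide b within max_gap positions.
    mean_dist[a][b] -- mean start-to-start distance of those pairs
                       (0.0 where the pair never occurs).
    """
    seq = seq.upper()
    # Encode every overlapping 3-mer; -1 marks 3-mers with non-ACGT symbols.
    codes = [INDEX.get(seq[i:i + 3], -1) for i in range(len(seq) - 2)]
    n = len(TRIMERS)
    counts = [[0] * n for _ in range(n)]
    dist_sum = [[0] * n for _ in range(n)]
    total = 0
    for i, a in enumerate(codes):
        if a < 0:
            continue
        for gap in range(1, max_gap + 1):
            if i + gap >= len(codes):
                break
            b = codes[i + gap]
            if b < 0:
                continue
            counts[a][b] += 1
            dist_sum[a][b] += gap
            total += 1
    freq = [[c / total if total else 0.0 for c in row] for row in counts]
    mean_dist = [
        [d / c if c else 0.0 for d, c in zip(d_row, c_row)]
        for d_row, c_row in zip(dist_sum, counts)
    ]
    return freq, mean_dist


if __name__ == "__main__":
    freq, mean_dist = trimer_pair_features("ACGTACGT")
    a, b = INDEX["ACG"], INDEX["CGT"]
    print(freq[a][b], mean_dist[a][b])
```

Under this assumed scheme the two matrices could be stacked as channels of a single image-like input for a 2-dimensional CNN; the actual input layout used by IPEV should be taken from the paper or the repository at https://github.com/basehc/IPEV.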