A predictive language model for SARS-CoV-2 evolution

Cited by: 0
Authors
Ma, Enhao [1]
Guo, Xuan [1,2]
Hu, Mingda [3]
Wang, Penghua [4]
Wang, Xin [3]
Wei, Congwen [3]
Cheng, Gong [1,2]
Affiliations
[1] Tsinghua Univ, Sch Basic Med Sci, 30 Shuangqing Rd, Beijing 100084, Peoples R China
[2] Inst Infect Dis, Shenzhen Bay Lab, Guangqiao Rd, Shenzhen 518000, Guangdong, Peoples R China
[3] Beijing Inst Biotechnol, 20 Dongdajie, Beijing 100071, Peoples R China
[4] Univ Connecticut Hlth Ctr, Sch Med, Dept Immunol, Farmington, CT 06030 USA
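Funding
National Natural Science Foundation of China;
Keywords
EVASION;
DOI
10.1038/s41392-024-02066-x
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract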
Modeling and predicting mutations is critical for preparedness against COVID-19 and similar pandemics. However, existing predictive models have yet to integrate the regularity and randomness of viral mutations while keeping data requirements minimal. Here, we develop a non-demanding language model that uses both regularity and randomness to predict candidate SARS-CoV-2 variants and mutations that might prevail. We constructed the "grammatical frameworks" of the available S1 sequences for dimension reduction and semantic representation, allowing the model to capture latent regularity. The mutational profile, defined as the frequency of mutations, was introduced into the model to incorporate randomness. With this model, we identified several variants with significantly enhanced viral infectivity and immune evasion and validated them in wet-lab experiments. By inputting sequence data from three different time points, we detected circulating strains or vital mutations for the XBB.1.16, EG.5, JN.1, and BA.2.86 strains before their emergence. Our results also predicted previously unknown variants that may cause future epidemics. Supported by both data validation and experimental evidence, our study presents a fast-responding, concise, and promising language model, potentially generalizable to other viral pathogens, that forecasts viral evolution and detects crucial mutation hot spots, thus warning of emerging variants that might raise public health concern.
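The abstract defines the mutational profile as the frequency of mutations. As a rough illustration only, the sketch below computes such a frequency table from aligned sequences. The function name `mutational_profile`, the toy reference, and the toy sequences are hypothetical; this is not the authors' implementation, and the record does not say how the profile enters the model.

```python
from collections import Counter


def mutational_profile(reference, sequences):
    """Return {(position, ref_residue, alt_residue): frequency}.

    Assumes every sequence is already aligned to the reference and has the
    same length. Positions are 1-based. Frequency is the fraction of input
    sequences that carry the substitution.
    """
    if not sequences:
        return {}
    counts = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference")
        for pos, (ref, alt) in enumerate(zip(reference, seq), start=1):
            if alt != ref:
                counts[(pos, ref, alt)] += 1
    total = len(sequences)
    return {mutation: n / total for mutation, n in counts.items()}


# Toy data (hypothetical, not real S1 sequences).
reference = "MFVFL"
sequences = ["MFVFL", "MFAFL", "MFAFL", "MYVFL"]
profile = mutational_profile(reference, sequences)
```

Here `profile` maps each observed substitution to the share of sequences that carry it, which is one plausible reading of "frequency of mutations".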
Pages: 17