Modeling and predicting mutations is critical for preparedness against COVID-19 and similar pandemics. However, existing predictive models have yet to integrate both the regularity and the randomness of viral mutations while keeping data requirements minimal. Here, we develop an undemanding language model that uses both regularity and randomness to predict candidate SARS-CoV-2 variants and mutations that might prevail. We constructed "grammatical frameworks" from the available S1 sequences for dimension reduction and semantic representation, so that the model captures their latent regularity. The mutational profile, defined as the frequency of mutations, was introduced into the model to incorporate randomness. With this model, we identified several variants with significantly enhanced viral infectivity and immune evasion and validated them in wet-lab experiments. By inputting sequence data from three different time points, we detected circulating strains or key mutations of the XBB.1.16, EG.5, JN.1, and BA.2.86 strains before their emergence. Our results also predicted previously unknown variants that may cause future epidemics. Supported by both data validation and experimental evidence, our study presents a fast-responding, concise, and promising language model, potentially generalizable to other viral pathogens, that forecasts viral evolution and detects crucial mutation hotspots, thereby giving early warning of emerging variants that might raise public health concern.
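To make the notion of a mutational profile concrete, the sketch below shows one possible reading of "the frequency of mutations": for a set of pre-aligned protein sequences, it counts how often each substitution relative to a reference occurs and divides by the number of sequences. This is an illustration only, not the authors' implementation; the function name `mutational_profile`, the toy reference, and the toy sequences are all hypothetical, and the abstract does not specify how the profile is normalized or fed into the model.

```python
from collections import Counter


def mutational_profile(reference, sequences):
    """Map (position, ref_residue, alt_residue) to its frequency.

    Hypothetical helper: assumes every sequence is already aligned to the
    reference and has the same length (no insertions or deletions).
    Positions are 1-based.
    """
    if not sequences:
        return {}
    counts = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
        for pos, (ref_aa, alt_aa) in enumerate(zip(reference, seq), start=1):
            if alt_aa != ref_aa:
                counts[(pos, ref_aa, alt_aa)] += 1
    total = len(sequences)
    # Frequency = share of sequences that carry the substitution.
    return {mutation: n / total for mutation, n in counts.items()}


# Toy data, invented for illustration (not real S1 sequences).
toy_reference = "NLKY"
toy_sequences = ["NLKY", "NLKF", "KLKF", "NLKF"]
profile = mutational_profile(toy_reference, toy_sequences)

for (pos, ref_aa, alt_aa), freq in sorted(profile.items()):
    print(f"{ref_aa}{pos}{alt_aa}: {freq:.2f}")
```

A profile of this kind could serve as a sampling weight when proposing candidate mutations, which is one way randomness might enter a generative model; whether the study does this is not stated in the abstract.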