Learning the Protein Language Model of SARS-CoV-2 Spike Proteins

Cited by: 0
Authors
Llanes, Paul Vincent [1]
Solano, Geoffrey [1]
Pontiveros, Marc Jermaine [2]
Affiliations
[1] Univ Philippines Manila, Dept Phys Sci & Math, Manila, Philippines
[2] Univ Philippines Diliman, Dept Comp Sci, Quezon City, Philippines
Keywords
SARS-CoV-2; spike proteins; sequence mutations; COVID-19; language modelling; recurrent neural network; Leiden clustering algorithm; viral escape
DOI
10.1109/ICAIIC57133.2023.10067040
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
SARS-CoV-2 has continued to evolve, posing an increased risk in terms of infectivity and transmissibility and causing a greater impact on communities worldwide. With the surge in collected SARS-CoV-2 sequences, studies have found that most emerging variants are linked to increased mutations in the spike (S) protein, as observed in the Alpha, Beta, Gamma, and Delta variants. Multiple genomic surveillance approaches have been used to monitor the mutational status and spread of the virus; however, most depend heavily on labels attributed to these sequences. Hence, this study features a system that learns the protein language model of SARS-CoV-2 spike proteins, based on a bidirectional long short-term memory (BiLSTM) recurrent neural network, using sequence data alone. After sequence embeddings are obtained from the model, clusters are generated using the Leiden clustering algorithm and visualized to monitor similarities between variants in terms of grammatical probability and semantic change. Additionally, the system measures the validity of a user-generated next-generation sequence, capturing potential sequence mutations indicative of viral escape, particularly substitution mutations. Further studies on methods for uncovering the semantic rules that govern spike proteins are recommended to learn more about other viral characteristics that could shed light on the future of the COVID-19 pandemic.
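As a rough illustration of how substitution mutations might be scored by grammatical probability and semantic change, the Python sketch below enumerates every single-residue substitution of a short peptide and ranks the mutants. It is not the authors' implementation: the trained BiLSTM is replaced by two toy stand-ins (background residue frequencies for grammaticality, 2-mer counts for the embedding), and all function names, the toy peptide, and the ranking rule are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_substitutions(seq):
    """Yield (position, original, new, mutant) for each single-residue substitution."""
    for i, orig in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != orig:
                yield i, orig, aa, seq[:i] + aa + seq[i + 1:]


def toy_embedding(seq):
    """Hypothetical stand-in for a learned embedding: counts of overlapping 2-mers."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def semantic_change(wild, mutant, embed=toy_embedding):
    """L1 distance between the embeddings of the wild-type and mutant sequences."""
    a, b = embed(wild), embed(mutant)
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


def toy_grammaticality(corpus):
    """Hypothetical stand-in for model probabilities: smoothed residue frequencies."""
    counts = Counter("".join(corpus))
    total = sum(counts.values()) + len(AMINO_ACIDS)  # add-one smoothing
    return {aa: (counts[aa] + 1) / total for aa in AMINO_ACIDS}


def rank_substitutions(wild, corpus):
    """Rank mutants by grammaticality, then semantic change, both descending."""
    probs = toy_grammaticality(corpus)
    scored = [
        {
            "mutation": f"{orig}{pos + 1}{new}",  # 1-based, e.g. "M1A"
            "grammaticality": probs[new],
            "semantic_change": semantic_change(wild, mutant),
        }
        for pos, orig, new, mutant in single_substitutions(wild)
    ]
    return sorted(
        scored,
        key=lambda r: (r["grammaticality"], r["semantic_change"]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy peptide and toy corpus, not real surveillance data.
    for row in rank_substitutions("MFVFLV", ["MFVFLV", "MFVYLV"])[:5]:
        print(row)
```

In a real system both stand-ins would presumably be replaced by outputs of the trained language model: the predicted probability of the substituted residue at its position, and the distance between the model's hidden-state embeddings of the two sequences.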
Pages: 429-434
Number of pages: 6