Learning the Protein Language Model of SARS-CoV-2 Spike Proteins

Cited by: 0
Authors
Llanes, Paul Vincent [1]
Solano, Geoffrey [1]
Pontiveros, Marc Jermaine [2]
Affiliations
[1] Univ Philippines Manila, Dept Phys Sci & Math, Manila, Philippines
[2] Univ Philippines Diliman, Dept Comp Sci, Quezon City, Philippines
Keywords
SARS-CoV-2; spike proteins; sequence mutations; COVID-19; language modelling; recurrent neural network; Leiden clustering algorithm; viral escape
DOI
10.1109/ICAIIC57133.2023.10067040
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The SARS-CoV-2 virus continues to evolve, posing an increased risk in terms of infectivity and transmissibility and causing greater impact on communities worldwide. With the surge of collected SARS-CoV-2 sequences, studies have found that most emerging variants are linked to increased mutations in the spike (S) protein, as observed in the Alpha, Beta, Gamma, and Delta variants. Multiple genomic surveillance approaches have been used to monitor the mutational status and spread of the virus; however, most depend heavily on labels attributed to these sequences. Hence, this study presents a system that learns a protein language model of SARS-CoV-2 spike proteins, based on a bidirectional long short-term memory (BiLSTM) recurrent neural network, using sequence data alone. After the sequence embeddings are obtained from the model, clusters are generated using the Leiden clustering algorithm and visualized to monitor similarities between variants in terms of grammatical probability and semantic change. Additionally, the system measures the validity of a user-generated next-generation sequence, capturing potential sequence mutations indicative of viral escape, particularly substitution mutations. Further studies on methods for uncovering the semantic rules that govern spike proteins are recommended to learn more about other viral characteristics that bear on the future course of the COVID-19 pandemic.
Pages: 429-434
Number of pages: 6
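The abstract describes scoring substitution mutations by grammatical probability and semantic change. The sketch below only illustrates that general idea and is not the authors' implementation: `toy_embed` and `toy_grammaticality` are hypothetical stand-ins for the outputs of a trained BiLSTM language model, the L1 distance and the rank-sum combination are assumptions, and the input peptide is an arbitrary toy sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(seq):
    """Yield (position, wild_type, mutant, mutated_sequence) for each single substitution."""
    for i, wt in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wt:
                yield i, wt, aa, seq[:i] + aa + seq[i + 1:]


def toy_embed(seq):
    """Hypothetical stand-in for a model embedding: position-weighted composition."""
    total = len(seq) * (len(seq) + 1) / 2
    return [
        sum(i + 1 for i, c in enumerate(seq) if c == aa) / total
        for aa in AMINO_ACIDS
    ]


def toy_grammaticality(seq, aa):
    """Hypothetical stand-in for a model probability: smoothed residue frequency."""
    return (seq.count(aa) + 1) / (len(seq) + len(AMINO_ACIDS))


def semantic_change(emb_wt, emb_mut):
    """L1 distance between two embeddings (one possible choice of metric)."""
    return sum(abs(a - b) for a, b in zip(emb_wt, emb_mut))


def rank_candidates(seq):
    """Score every single substitution and sort by the sum of both ranks."""
    wt_emb = toy_embed(seq)
    rows = []
    for pos, wt, aa, mutated in single_substitutions(seq):
        rows.append({
            "mutation": f"{wt}{pos + 1}{aa}",  # e.g. A1C: wild type, 1-based position, mutant
            "grammaticality": toy_grammaticality(seq, aa),
            "semantic_change": semantic_change(wt_emb, toy_embed(mutated)),
        })
    # Assign ascending ranks on each axis; a higher rank means a higher score.
    for key in ("grammaticality", "semantic_change"):
        for rank, row in enumerate(sorted(rows, key=lambda r: r[key])):
            row[key + "_rank"] = rank
    return sorted(
        rows,
        key=lambda r: r["grammaticality_rank"] + r["semantic_change_rank"],
        reverse=True,
    )


if __name__ == "__main__":
    # Print the five highest-ranked substitutions for a toy peptide.
    for row in rank_candidates("MKTAYIAKQR")[:5]:
        print(row["mutation"], row["grammaticality"], row["semantic_change"])
```

In a real pipeline, the two toy functions would be replaced by the per-position output probabilities and the hidden-state embeddings of the trained model; the enumeration and ranking steps would stay the same.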