PreAlgPro: Prediction of allergenic proteins with pre-trained protein language model and efficient neural network

Cited by: 1
Authors
Zhang, Lingrong [1 ]
Liu, Taigang [1 ]
Affiliations
[1] Shanghai Ocean Univ, Coll Informat Technol, Shanghai 201306, Peoples R China
Keywords
Pre-trained protein language model; Allergenic proteins; Deep learning; Model interpretability; Database;
DOI
10.1016/j.ijbiomac.2024.135762
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject Classification Codes
071010; 081704;
Abstract
Allergy is a prevalent condition triggered by allergens such as those in nuts and milk, and avoiding exposure to allergens remains the most effective way to prevent allergic reactions. However, current homology-based methods for identifying allergenic proteins struggle with non-homologous data, and traditional machine learning approaches rely on manually extracted features that omit important functional characteristics of proteins, including evolutionary information. Consequently, existing methods leave considerable room for improvement. In this study, we present PreAlgPro, a method for identifying allergenic proteins that combines a pre-trained protein language model with deep learning. Specifically, we employed the ProtT5 model to extract protein embedding features, replacing the manual feature extraction step, and devised an Attention-CNN neural network architecture to identify latent features that contribute to the classification of allergenic proteins. The model was evaluated on four independent test sets, and the experimental results demonstrate that PreAlgPro surpasses existing state-of-the-art methods. Additionally, we collected further allergenic protein samples to validate the robustness of the model and conducted an analysis of its interpretability.
Pages: 11
Related Papers
50 records in total
  • [31] Knowledge Enhanced Pre-trained Language Model for Product Summarization
    Yin, Wenbo
    Ren, Junxiang
    Wu, Yuejiao
    Song, Ruilin
    Liu, Lang
    Cheng, Zhen
    Wang, Sibo
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT II, 2022, 13552 : 263 - 273
  • [32] Predictive Recognition of DNA-binding Proteins Based on Pre-trained Language Model BERT
    Ma, Yue
    Pei, Yongzhen
    Li, Changguo
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2023, 21 (06)
  • [33] ConfliBERT: A Pre-trained Language Model for Political Conflict and Violence
    Hu, Yibo
    Hosseini, MohammadSaleh
    Parolin, Erick Skorupa
    Osorio, Javier
    Khan, Latifur
    Brandt, Patrick T.
    D'Orazio, Vito J.
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5469 - 5482
  • [34] IndicBART: A Pre-trained Model for Indic Natural Language Generation
    Dabre, Raj
    Shrotriya, Himani
    Kunchukuttan, Anoop
    Puduppully, Ratish
    Khapra, Mitesh M.
    Kumar, Pratyush
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1849 - 1863
  • [35] Pre-trained Language Model based Ranking in Baidu Search
    Zou, Lixin
    Zhang, Shengqiang
    Cai, Hengyi
    Ma, Dehong
    Cheng, Suqi
    Wang, Shuaiqiang
    Shi, Daiting
    Cheng, Zhicong
    Yin, Dawei
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 4014 - 4022
  • [36] Leveraging Pre-trained Language Model for Speech Sentiment Analysis
    Shon, Suwon
    Brusco, Pablo
    Pan, Jing
    Han, Kyu J.
    Watanabe, Shinji
    INTERSPEECH 2021, 2021, : 3420 - 3424
  • [37] Software Vulnerabilities Detection Based on a Pre-trained Language Model
    Xu, Wenlin
    Li, Tong
    Wang, Jinsong
    Duan, Haibo
    Tang, Yahui
    2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 904 - 911
  • [38] AraXLNet: pre-trained language model for sentiment analysis of Arabic
    Alduailej, Alhanouf
    Alothaim, Abdulrahman
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [39] A survey of text classification based on pre-trained language model
    Wu, Yujia
    Wan, Jun
    NEUROCOMPUTING, 2025, 616
  • [40] Integrating Pre-Trained Language Model With Physical Layer Communications
    Lee, Ju-Hyung
    Lee, Dong-Ho
    Lee, Joohan
    Pujara, Jay
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (11) : 17266 - 17278