Benchmarking DNA large language models on quadruplexes

Cited by: 0

Authors:

Cherednichenko, Oleksandr [1]

Herbert, Alan [1, 2]

Poptsova, Maria [1]

Affiliations:

[1] HSE Univ, Int Lab Bioinformat, Moscow, Russia

[2] InsideOutBio, Charlestown, MA USA

Source:

Computational and Structural Biotechnology Journal | 2025 / Volume 27

Keywords:

Foundation model; Large language model; DNABERT; HyenaDNA; MAMBA-DNA; Caduceus; Flipons; Non-B DNA; G-quadruplexes

DOI:

10.1016/j.csbj.2025.03.007

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]; Q7 [Molecular Biology]

Discipline classification code:

071010; 081704

Abstract:

Large language models (LLMs) in genomics have successfully predicted various functional genomic elements. While their performance is typically evaluated using genomic benchmark datasets, it remains unclear which LLM is best suited for specific downstream tasks, particularly for generating whole-genome annotations. Current LLMs in genomics fall into three main categories: transformer-based models, long convolution-based models, and state-space models (SSMs). In this study, we benchmarked these three types of LLM architecture for generating whole-genome maps of G-quadruplexes (GQs), a type of flipon, or non-B DNA structure, characterized by distinctive patterns and functional roles in diverse regulatory contexts. Although a GQ forms by folding guanosine residues into tetrads, the computational task is challenging because the bases involved may be on different strands, separated by a large number of nucleotides, or made from RNA rather than DNA. All LLMs performed comparably well, with DNABERT-2 and HyenaDNA achieving superior results based on F1 and MCC. Analysis of whole-genome annotations revealed that HyenaDNA recovered more quadruplexes in distal enhancers and intronic regions. The models were better suited to detecting large GQ arrays that likely contribute to the nuclear condensates involved in gene transcription and chromosomal scaffolds. HyenaDNA and Caduceus formed a separate grouping in the generated de novo quadruplexes, while transformer-based models clustered together. Overall, our findings suggest that different types of LLMs complement each other. Genomic architectures with varying context lengths can detect distinct functional regulatory elements, underscoring the importance of selecting the appropriate model based on the specific genomic task. The code and data underlying this article are available at https://github.com/powidla/G4s-FMs.


Pages: 992-1000

Page count: 9
