Benchmarking DNA large language models on quadruplexes

Cited by: 0

Authors:

Cherednichenko, Oleksandr [1]

Herbert, Alan [1,2]

Poptsova, Maria [1]

Affiliations:

[1] HSE Univ, Int Lab Bioinformat, Moscow, Russia

[2] InsideOutBio, Charlestown, MA USA

Source:

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL | 2025 / Vol. 27

Keywords:

Foundation model; Large language model; DNABERT; HyenaDNA; MAMBA-DNA; Caduceus; Flipons; Non-B DNA; G-quadruplexes

DOI:

10.1016/j.csbj.2025.03.007

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology]

Subject classification codes:

071010 ; 081704 ;

Abstract:

Large language models (LLMs) in genomics have successfully predicted various functional genomic elements. While their performance is typically evaluated using genomic benchmark datasets, it remains unclear which LLM is best suited for specific downstream tasks, particularly for generating whole-genome annotations. Current LLMs in genomics fall into three main categories: transformer-based models, long convolution-based models, and state-space models (SSMs). In this study, we benchmarked three different types of LLM architectures for generating whole-genome maps of G-quadruplexes (GQ), a type of flipon, or non-B DNA structure, characterized by distinctive patterns and functional roles in diverse regulatory contexts. Although GQ forms from folding guanosine residues into tetrads, the computational task is challenging as the bases involved may be on different strands, separated by a large number of nucleotides, or made from RNA rather than DNA. All LLMs performed comparably well, with DNABERT-2 and HyenaDNA achieving superior results based on F1 and MCC. Analysis of whole-genome annotations revealed that HyenaDNA recovered more quadruplexes in distal enhancers and intronic regions. The models were better suited to detecting large GQ arrays that likely contribute to the nuclear condensates involved in gene transcription and chromosomal scaffolds. HyenaDNA and Caduceus formed a separate grouping in the generated de novo quadruplexes, while transformer-based models clustered together. Overall, our findings suggest that different types of LLMs complement each other. Genomic architectures with varying context lengths can detect distinct functional regulatory elements, underscoring the importance of selecting the appropriate model based on the specific genomic task. The code and data underlying this article are available at https://github.com/powidla/G4s-FMs


Pages: 992-1000

Page count: 9
