Benchmarking DNA large language models on quadruplexes

Cited: 0
Authors
Cherednichenko, Oleksandr [1 ]
Herbert, Alan [1 ,2 ]
Poptsova, Maria [1 ]
Affiliations
[1] HSE Univ, Int Lab Bioinformat, Moscow, Russia
[2] InsideOutBio, Charlestown, MA USA
Keywords
Foundation model; Large language model; DNABERT; HyenaDNA; MAMBA-DNA; Caduceus; Flipons; Non-B DNA; G-quadruplexes;
DOI
10.1016/j.csbj.2025.03.007
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Large language models (LLMs) in genomics have successfully predicted various functional genomic elements. While their performance is typically evaluated on genomic benchmark datasets, it remains unclear which LLM is best suited for specific downstream tasks, particularly for generating whole-genome annotations. Current LLMs in genomics fall into three main categories: transformer-based models, long convolution-based models, and state-space models (SSMs). In this study, we benchmarked these three types of LLM architecture for generating whole-genome maps of G-quadruplexes (GQ), a type of flipon, or non-B DNA structure, characterized by distinctive patterns and functional roles in diverse regulatory contexts. Although a GQ forms when guanosine residues fold into tetrads, the computational task is challenging because the bases involved may lie on different strands, be separated by a large number of nucleotides, or derive from RNA rather than DNA. All LLMs performed comparably well, with DNABERT-2 and HyenaDNA achieving superior results based on F1 and MCC. Analysis of whole-genome annotations revealed that HyenaDNA recovered more quadruplexes in distal enhancers and intronic regions. The models were better suited to detecting large GQ arrays that likely contribute to the nuclear condensates involved in gene transcription and to chromosomal scaffolds. HyenaDNA and Caduceus formed a separate grouping among the de novo generated quadruplexes, while transformer-based models clustered together. Overall, our findings suggest that different types of LLMs complement each other. Genomic architectures with varying context lengths can detect distinct functional regulatory elements, underscoring the importance of selecting the appropriate model for the specific genomic task. The code and data underlying this article are available at https://github.com/powidla/G4s-FMs
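The abstract compares models using F1 and the Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics are computed for binary per-position quadruplex labels (the function name and toy labels below are illustrative, not from the paper's repository):

```python
import math

def f1_and_mcc(y_true, y_pred):
    """Compute F1 and Matthews correlation coefficient for binary labels.

    Counts the four confusion-matrix cells, then applies the standard
    definitions; degenerate denominators return 0.0.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc

# Toy per-nucleotide labels (1 = position inside an annotated quadruplex).
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
print(f1_and_mcc(y_true, y_pred))  # → (0.75, 0.5)
```

MCC is preferred alongside F1 here because genome-wide GQ annotation is heavily class-imbalanced, and MCC accounts for true negatives while F1 does not.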
Pages: 992-1000
Page count: 9