Benchmarking DNA large language models on quadruplexes

Cited by: 0
Authors
Cherednichenko, Oleksandr [1 ]
Herbert, Alan [1 ,2 ]
Poptsova, Maria [1 ]
Affiliations
[1] HSE Univ, Int Lab Bioinformat, Moscow, Russia
[2] InsideOutBio, Charlestown, MA USA
Keywords
Foundation model; Large language model; DNABERT; HyenaDNA; MAMBA-DNA; Caduceus; Flipons; Non-B DNA; G-quadruplexes
DOI
10.1016/j.csbj.2025.03.007
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject Classification Codes
071010; 081704
Abstract
Large language models (LLMs) in genomics have successfully predicted various functional genomic elements. While their performance is typically evaluated on genomic benchmark datasets, it remains unclear which LLM is best suited for specific downstream tasks, particularly for generating whole-genome annotations. Current LLMs in genomics fall into three main categories: transformer-based models, long convolution-based models, and state-space models (SSMs). In this study, we benchmarked these three types of LLM architectures on generating whole-genome maps of G-quadruplexes (GQs), a type of flipon, or non-B DNA structure, characterized by distinctive sequence patterns and functional roles in diverse regulatory contexts. Although GQs form when guanosine residues fold into tetrads, the computational task is challenging because the bases involved may lie on different strands, be separated by a large number of nucleotides, or derive from RNA rather than DNA. All LLMs performed comparably well, with DNABERT-2 and HyenaDNA achieving superior results based on F1 and MCC. Analysis of whole-genome annotations revealed that HyenaDNA recovered more quadruplexes in distal enhancers and intronic regions. The models were better suited to detecting large GQ arrays that likely contribute to the nuclear condensates involved in gene transcription and to chromosomal scaffolds. HyenaDNA and Caduceus formed a separate grouping among the de novo generated quadruplexes, while the transformer-based models clustered together. Overall, our findings suggest that the different types of LLMs complement each other: genomic architectures with varying context lengths detect distinct functional regulatory elements, underscoring the importance of selecting the appropriate model for the specific genomic task. The code and data underlying this article are available at https://github.com/powidla/G4s-FMs
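The abstract ranks models by F1 and Matthews correlation coefficient (MCC) over binary GQ annotations. As an illustrative sketch only (not the authors' pipeline), the Python below pairs a regex baseline for the canonical intramolecular GQ motif (four runs of three or more guanines separated by loops of 1-7 nucleotides, a widely used heuristic) with the two metrics; the 20-nt window size and the motif pattern are assumptions for demonstration.

```python
# Illustrative sketch: regex GQ baseline plus F1/MCC for binary labels.
# The motif pattern and window size are common conventions, not the paper's.
import re
from math import sqrt

# Canonical intramolecular GQ motif: >=3 Gs, loop of 1-7 nt, repeated 4x.
GQ_PATTERN = re.compile(r"(G{3,}\w{1,7}){3}G{3,}")

def label_sequence(seq: str, window: int = 20) -> list[int]:
    """Label each fixed-size window 1 if it overlaps a canonical GQ motif."""
    hits = [(m.start(), m.end()) for m in GQ_PATTERN.finditer(seq.upper())]
    labels = []
    for start in range(0, len(seq) - window + 1, window):
        end = start + window
        labels.append(int(any(s < end and e > start for s, e in hits)))
    return labels

def f1_mcc(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """F1 and Matthews correlation coefficient for binary labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return f1, mcc
```

MCC is often preferred over F1 for genome-wide annotation because it accounts for true negatives, which dominate when GQ-forming windows are rare.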
Pages: 992-1000 (9 pages)