TransPeakNet for solvent-aware 2D NMR prediction via multi-task pre-training and unsupervised learning

Cited by: 0
Authors
Li, Yunrui [1 ]
Xu, Hao [2 ]
Kumar, Ambrish [3 ]
Wang, Duo-Sheng [4 ]
Heiss, Christian [3 ]
Azadi, Parastoo [3 ]
Hong, Pengyu [1 ]
Affiliations
[1] Brandeis Univ, Dept Comp Sci, Waltham, MA 02453 USA
[2] Harvard Med Sch, Dept Med, Boston, MA USA
[3] Univ Georgia, Complex Carbohydrate Res Ctr, Athens, GA USA
[4] Boston Coll, Dept Chem, Chestnut Hill, MA USA
Source
COMMUNICATIONS CHEMISTRY | 2025, Vol. 8, Issue 1
Funding
U.S. National Science Foundation;
Keywords
CHEMICAL-SHIFTS; NEURAL-NETWORK; ASSIGNMENT; C-13; H-1; COMPUTATION; DATABASE; HOSE;
DOI
10.1038/s42004-025-01455-9
CLC Number
O6 [Chemistry];
Subject Classification Code
0703;
Abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is essential for revealing molecular structure, electronic environment, and dynamics. Accurate NMR shift prediction allows researchers to validate structures by comparing predicted and observed shifts. While Machine Learning (ML) has improved one-dimensional (1D) NMR shift prediction, predicting 2D NMR remains challenging due to limited annotated data. To address this, we introduce an unsupervised training framework for predicting cross-peaks in 2D NMR, specifically Heteronuclear Single Quantum Coherence (HSQC) spectra. Our approach pre-trains an ML model on an annotated 1D dataset of 1H and 13C shifts, then fine-tunes it in an unsupervised manner on unlabeled HSQC data, simultaneously generating cross-peak annotations. The model also adjusts for solvent effects. Evaluation on 479 expert-annotated HSQC spectra demonstrates the model's superiority over traditional methods (ChemDraw and MestReNova), achieving Mean Absolute Errors (MAEs) of 2.05 ppm for 13C shifts and 0.165 ppm for 1H shifts. Our algorithmic annotations show 95.21% concordance with expert assignments, underscoring the approach's potential for structural elucidation in fields such as organic chemistry, pharmaceuticals, and natural products.
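As a rough illustration of the unsupervised annotation step described in the abstract, the sketch below (our own, not the authors' code; the function name match_cross_peaks, the normalization scales, and the example shift values are all illustrative assumptions) aligns predicted (13C, 1H) cross-peaks with unlabeled experimental HSQC peaks by optimal assignment; the matched pairs can then serve as pseudo-labels and support per-axis MAE metrics of the kind reported above.

    # Minimal sketch, assuming cross-peaks are matched by nearest-neighbor
    # optimal assignment in the normalized (13C, 1H) shift plane. Names and
    # values here are illustrative, not from the paper.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Rough scales (ppm) to balance the two axes when computing distances:
    # 13C shifts span roughly 0-200 ppm, 1H shifts roughly 0-12 ppm.
    C_SCALE, H_SCALE = 200.0, 12.0

    def match_cross_peaks(predicted, observed):
        """Assign predicted cross-peaks (n_pred, 2) to observed HSQC peaks
        (n_obs, 2); columns are (13C ppm, 1H ppm). Returns index pairs."""
        pred = predicted / np.array([C_SCALE, H_SCALE])
        obs = observed / np.array([C_SCALE, H_SCALE])
        # Pairwise Euclidean distances between all predicted/observed peaks.
        cost = np.linalg.norm(pred[:, None, :] - obs[None, :, :], axis=-1)
        return linear_sum_assignment(cost)  # Hungarian algorithm

    # Example: generate pseudo-label pairings and compute per-axis MAEs.
    predicted = np.array([[128.4, 7.21], [55.2, 3.70], [21.1, 2.31]])
    observed = np.array([[55.0, 3.68], [128.9, 7.25], [21.3, 2.33]])
    pi, oi = match_cross_peaks(predicted, observed)
    mae_c = np.abs(predicted[pi, 0] - observed[oi, 0]).mean()  # 13C MAE (ppm)
    mae_h = np.abs(predicted[pi, 1] - observed[oi, 1]).mean()  # 1H MAE (ppm)
    print(f"13C MAE: {mae_c:.3f} ppm, 1H MAE: {mae_h:.3f} ppm")

The optimal-assignment step is one standard way to pair two unordered peak sets; the paper's actual matching procedure and loss may differ.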
Pages: 10