TransPeakNet for solvent-aware 2D NMR prediction via multi-task pre-training and unsupervised learning

Cited by: 0
Authors
Li, Yunrui [1 ]
Xu, Hao [2 ]
Kumar, Ambrish [3 ]
Wang, Duo-Sheng [4 ]
Heiss, Christian [3 ]
Azadi, Parastoo [3 ]
Hong, Pengyu [1 ]
Affiliations
[1] Brandeis Univ, Dept Comp Sci, Waltham, MA 02453 USA
[2] Harvard Med Sch, Dept Med, Boston, MA USA
[3] Univ Georgia, Complex Carbohydrate Res Ctr, Athens, GA USA
[4] Boston Coll, Dept Chem, Chestnut Hill, MA USA
Source
COMMUNICATIONS CHEMISTRY | 2025, Vol. 8, No. 1
Funding
U.S. National Science Foundation;
Keywords
CHEMICAL-SHIFTS; NEURAL-NETWORK; ASSIGNMENT; C-13; H-1; COMPUTATION; DATABASE; HOSE;
DOI
10.1038/s42004-025-01455-9
Chinese Library Classification (CLC)
O6 [Chemistry];
Subject classification code
0703;
Abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is essential for revealing molecular structure, electronic environment, and dynamics. Accurate NMR shift prediction allows researchers to validate structures by comparing predicted and observed shifts. While Machine Learning (ML) has improved one-dimensional (1D) NMR shift prediction, predicting 2D NMR remains challenging due to limited annotated data. To address this, we introduce an unsupervised training framework for predicting cross-peaks in 2D NMR, specifically Heteronuclear Single Quantum Coherence (HSQC). Our approach pretrains an ML model on an annotated 1D dataset of 1H and 13C shifts, then finetunes it in an unsupervised manner using unlabeled HSQC data, which simultaneously generates cross-peak annotations. Our model also adjusts for solvent effects. Evaluation on 479 expert-annotated HSQC spectra demonstrates our model's superiority over traditional methods (ChemDraw and Mestrenova), achieving Mean Absolute Errors (MAEs) of 2.05 ppm and 0.165 ppm for 13C shifts and 1H shifts respectively. Our algorithmic annotations show a 95.21% concordance with experts' assignments, underscoring the approach's potential for structural elucidation in fields like organic chemistry, pharmaceuticals, and natural products.
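The abstract reports per-dimension MAEs over expert-annotated HSQC cross-peaks. As an illustration only, the sketch below shows one plausible way to score such predictions: pair predicted and expert-assigned (13C, 1H) cross-peaks by optimal one-to-one matching and compute the MAE in each dimension. The matching rule, the 0.1 weighting of 13C relative to 1H distances, and the function name hsqc_mae are assumptions made for this sketch, not details taken from the paper.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def hsqc_mae(pred_peaks, true_peaks, c_weight=0.1):
        # pred_peaks, true_peaks: (n, 2) arrays of (13C ppm, 1H ppm) cross-peaks.
        pred = np.asarray(pred_peaks, dtype=float)
        true = np.asarray(true_peaks, dtype=float)
        # Weighted L1 cost between every predicted and every observed cross-peak;
        # the 0.1 factor roughly balances the wider 13C scale against 1H (assumed).
        cost = (c_weight * np.abs(pred[:, None, 0] - true[None, :, 0])
                + np.abs(pred[:, None, 1] - true[None, :, 1]))
        rows, cols = linear_sum_assignment(cost)  # optimal one-to-one peak matching
        mae_c = np.mean(np.abs(pred[rows, 0] - true[cols, 0]))  # 13C MAE in ppm
        mae_h = np.mean(np.abs(pred[rows, 1] - true[cols, 1]))  # 1H MAE in ppm
        return mae_c, mae_h

    # Toy example: two predicted vs. two expert-assigned cross-peaks.
    print(hsqc_mae([[30.2, 1.25], [128.5, 7.30]], [[29.8, 1.21], [128.0, 7.26]]))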
Pages: 10