EnzymeMap: curation, validation and data-driven prediction of enzymatic reactions

Cited by: 7
Authors
Heid, Esther [1, 2]
Probst, Daniel [3]
Green, William H. [2]
Madsen, Georg K. H. [1]
Affiliations
[1] TU Wien, Inst Mat Chem, A-1060 Vienna, Austria
[2] MIT, Dept Chem Engn, Cambridge, MA 02139 USA
[3] IBM Res Europe, CH-8803 Ruschlikon, Switzerland
Funding
Austrian Science Fund;
Keywords
BIOCATALYSIS; CASCADE; RETROSYNTHESIS; RESOURCE; OUTCOMES; DESIGN; TOOL;
DOI
10.1039/d3sc02048g
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Enzymatic reactions are an ecofriendly, selective, and versatile addition to, and sometimes even an alternative to, organic reactions for the synthesis of chemical compounds such as pharmaceuticals or fine chemicals. To identify suitable reactions, computational models to predict the activity of enzymes on non-native substrates, to perform retrosynthetic pathway searches, or to predict the outcomes of reactions including regio- and stereoselectivity are becoming increasingly important. However, current approaches are substantially hindered by the limited amount of available data, especially if balanced and atom-mapped reactions are needed and if the models feature machine learning components. We therefore constructed a high-quality dataset (EnzymeMap) by developing a large set of correction and validation algorithms for recorded reactions in the literature, and we showcase its significant positive impact on machine learning models of retrosynthesis, forward prediction, and regioselectivity prediction, outperforming previous approaches by a large margin. Our dataset allows for deep learning models of enzymatic reactions with unprecedented accuracy, and is freely available online. A new curation and atom-mapping routine leading to a large database of enzymatic reactions boosts the performance of deep learning models.
Pages: 14229-14242
Number of pages: 14