Data Fusion-based Discovery (DAFdiscovery) pipeline to aid compound annotation and bioactive compound discovery across diverse spectral data

Cited by: 9
Authors
Borges, Ricardo Moreira [1]
Costa, Fernanda das Neves [1]
Chagas, Fernanda O. [1]
Teixeira, Andrew Magno [1]
Yoon, Jaewon [2]
Weiss, Marcio Barczyszyn [2]
Crnkovic, Camila Manoel [2]
Pilon, Alan Cesar [3]
Garrido, Bruno C. [4]
Betancur, Luz Adriana [5]
Forero, Abel M. [6, 7, 8]
Castellanos, Leonardo [6]
Ramos, Freddy A. [6]
Pupo, Monica T. [3]
Kuhn, Stefan [9]
Affiliations
[1] Univ Fed Rio de Janeiro, Inst Pesquisas Prod Nat Walter Mors, Rio De Janeiro, Brazil
[2] Univ Sao Paulo, Fac Ciencias Farmaceut, Sao Paulo, Brazil
[3] Univ Sao Paulo, Fac Ciencias Farmaceut Ribeirao Preto, Sao Paulo, Brazil
[4] Organ Anal Lab, Chem Metrol Div, Inmetro, Brazil
[5] Univ Caldas, Dept Quim, Edificio Orlando Sierra, Caldas, Colombia
[6] Univ Nacl Colombia, Dept Quim, Sede Bogota, Bogota, Colombia
[7] Univ A Coruna, Dept Quim, Fac Ciencias, Coruna, Spain
[8] Univ A Coruna, Ctr Invest Cient Avanzadas CI CA, Coruna, Spain
[9] De Montfort Univ, Sch Comp Sci & Informat, Leicester, Leics, England
Funding
São Paulo Research Foundation, Brazil;
Keywords
NMR;
DOI
10.1002/pca.3178
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Introduction: Data Fusion-based Discovery (DAFdiscovery) is a pipeline that helps users combine mass spectrometry (MS), nuclear magnetic resonance (NMR), and bioactivity data in a notebook-based application to accelerate the annotation and discovery of bioactive compounds. It applies Statistical Total Correlation Spectroscopy (STOCSY) and Statistical HeteroSpectroscopy (SHY) calculations to the user's data in an easy-to-follow Jupyter Notebook.
Method: Several case studies are presented for benchmarking, and the resulting outputs are shown to aid natural product identification and discovery. The goal is to encourage users to acquire MS and NMR data from their samples (in replicates and fractions when available) and to explore the variance in those data to highlight correlated MS features, NMR peaks, and bioactivity, either to accelerate bioactive compound discovery or to support annotation and identification studies.
Results: Applications were demonstrated using data from different research groups, and DAFdiscovery reproduced their findings using a more straightforward method.
Conclusion: DAFdiscovery proved simple to use in situations where data from different sources must be analyzed together.
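The abstract names STOCSY and SHY without showing what they compute. The sketch below is not the DAFdiscovery notebook code; it is a minimal illustration, assuming each data block is a samples-by-variables matrix with the same samples in the same row order. The function names `stocsy` and `shy` and the toy data are hypothetical. STOCSY is taken here as the Pearson correlation (and covariance) between one chosen "driver" variable and every other variable, and SHY as the cross-correlation between variables of two different blocks (for example NMR and MS).

```python
import numpy as np


def stocsy(X, driver_idx):
    """Correlate every column of X with the driver column across samples.

    X is a (samples x variables) matrix. Returns (correlation, covariance),
    each of length n_variables. Constant columns get a correlation of 0.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # mean-center each variable
    n = X.shape[0]
    cov = Xc.T @ Xc[:, driver_idx] / (n - 1)
    std = Xc.std(axis=0, ddof=1)
    denom = std * std[driver_idx]
    corr = np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)
    return corr, cov


def shy(X_a, X_b):
    """Cross-correlation matrix between two blocks measured on the same samples.

    Returns an (n_variables_a x n_variables_b) matrix of Pearson correlations.
    """
    A = np.asarray(X_a, dtype=float)
    B = np.asarray(X_b, dtype=float)
    if A.shape[0] != B.shape[0]:
        raise ValueError("both blocks must have the same number of samples")
    Ac = A - A.mean(axis=0)
    Bc = B - B.mean(axis=0)
    cov = Ac.T @ Bc / (A.shape[0] - 1)
    denom = np.outer(Ac.std(axis=0, ddof=1), Bc.std(axis=0, ddof=1))
    return np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)


# Toy data (made up): six fractions in which one compound's level varies.
level = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
nmr = np.column_stack([2.0 * level, 0.5 * level, [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]])
ms = np.column_stack([10.0 * level, 7.0 - level])
bioactivity = (3.0 * level).reshape(-1, 1)

# Data fusion as simple column-wise concatenation, then STOCSY driven by
# the bioactivity column (the last column of the fused matrix).
fused = np.hstack([nmr, ms, bioactivity])
corr_to_activity, _ = stocsy(fused, driver_idx=fused.shape[1] - 1)

# SHY between the NMR and MS blocks.
nmr_ms_corr = shy(nmr, ms)
```

In this toy case the NMR peaks and MS feature that scale with the compound level correlate strongly with bioactivity, which is the kind of link the pipeline is meant to highlight. Real spectra would first need alignment, normalization, and scaling, none of which is shown here.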
Pages: 48-55
Number of pages: 8