Data Fusion-based Discovery (DAFdiscovery) pipeline to aid compound annotation and bioactive compound discovery across diverse spectral data

Cited by: 9
Authors
Borges, Ricardo Moreira [1]
Costa, Fernanda das Neves [1]
Chagas, Fernanda O. [1]
Teixeira, Andrew Magno [1]
Yoon, Jaewon [2]
Weiss, Marcio Barczyszyn [2]
Crnkovic, Camila Manoel [2]
Pilon, Alan Cesar [3]
Garrido, Bruno C. [4]
Betancur, Luz Adriana [5]
Forero, Abel M. [6, 7, 8]
Castellanos, Leonardo [6]
Ramos, Freddy A. [6]
Pupo, Monica T. [3]
Kuhn, Stefan [9]
Affiliations
[1] Univ Fed Rio de Janeiro, Inst Pesquisas Prod Nat Walter Mors, Rio De Janeiro, Brazil
[2] Univ Sao Paulo, Fac Ciencias Farmaceut, Sao Paulo, Brazil
[3] Univ Sao Paulo, Fac Ciencias Farmaceut Ribeirao Preto, Sao Paulo, Brazil
[4] Organ Anal Lab, Chem Metrol Div, Inmetro, Brazil
[5] Univ Caldas, Dept Quim, Edificio Orlando Sierra, Caldas, Colombia
[6] Univ Nacl Colombia, Dept Quim, Sede Bogota, Bogota, Colombia
[7] Univ A Coruna, Dept Quim, Fac Ciencias, Coruna, Spain
[8] Univ A Coruna, Ctr Invest Cient Avanzadas CI CA, Coruna, Spain
[9] De Montfort Univ, Sch Comp Sci & Informat, Leicester, Leics, England
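Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
NMR
DOI
10.1002/pca.3178
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract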
Introduction: Data Fusion-based Discovery (DAFdiscovery) is a pipeline designed to help users combine mass spectrometry (MS), nuclear magnetic resonance (NMR), and bioactivity data in a notebook-based application to accelerate the annotation and discovery of bioactive compounds. It applies Statistical Total Correlation Spectroscopy (STOCSY) and Statistical HeteroSpectroscopy (SHY) calculations to the user's data in an easy-to-follow Jupyter Notebook.
Method: Different case studies are presented for benchmarking, and the resulting outputs are shown to aid natural product identification and discovery. The goal is to encourage users to acquire MS and NMR data from their samples (in replicated samples and fractions when available) and to explore the variance in those data to highlight MS features, NMR peaks, and bioactivity that might be correlated, either to accelerate bioactive compound discovery or to support annotation-identification studies.
Results: Applications were demonstrated using data from different research groups, and DAFdiscovery reproduced their findings using a more straightforward method.
Conclusion: DAFdiscovery has proven to be a simple-to-use method for situations in which data from different sources need to be analyzed together.
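The statistical idea named in the abstract can be illustrated with a minimal sketch. It is not the DAFdiscovery code: the function name `correlate_with_driver`, the toy data, and the choice of Pearson correlation on raw intensities are assumptions made here for illustration. STOCSY correlates one chosen "driver" variable against every other variable of the same spectral block across samples; SHY does the same across two different blocks (for example an NMR peak against MS features); and a bioactivity vector can be used as the driver against the fused NMR and MS table.

```python
import numpy as np


def correlate_with_driver(driver, data):
    """Pearson correlation of a driver vector (n_samples,) with each
    column of data (n_samples, n_variables). Hypothetical helper."""
    driver = np.asarray(driver, dtype=float)
    data = np.asarray(data, dtype=float)
    d = driver - driver.mean()
    x = data - data.mean(axis=0)
    denom = np.sqrt((d ** 2).sum() * (x ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        r = (d @ x) / denom
    return np.nan_to_num(r)  # constant columns are reported as 0


# Toy data (entirely synthetic): 12 samples in which one compound varies.
rng = np.random.default_rng(0)
n_samples = 12
conc = rng.uniform(0.5, 2.0, n_samples)  # hypothetical concentration

nmr = rng.normal(0.0, 0.05, (n_samples, 6))  # 6 NMR bins, mostly noise
nmr[:, 1] += conc                            # two peaks of the compound
nmr[:, 4] += 0.5 * conc

ms = rng.normal(0.0, 0.05, (n_samples, 4))   # 4 MS features, mostly noise
ms[:, 2] += 3.0 * conc                       # the compound's MS feature

bioactivity = 10.0 * conc + rng.normal(0.0, 0.5, n_samples)

# STOCSY-style: NMR driver peak against all NMR bins.
stocsy = correlate_with_driver(nmr[:, 1], nmr)

# SHY-style: the same NMR driver against all MS features.
shy = correlate_with_driver(nmr[:, 1], ms)

# Bioactivity as driver against the fused NMR + MS table.
fused = np.hstack([nmr, ms])
bio = correlate_with_driver(bioactivity, fused)

print("STOCSY:", np.round(stocsy, 2))
print("SHY:   ", np.round(shy, 2))
print("Bio:   ", np.round(bio, 2))
```

In this toy setup the variables that belong to the varying compound show correlations close to 1 with the driver, which is the kind of pattern such an analysis looks for; with few samples, unrelated variables can still show sizeable correlations by chance, so replicates and fractions matter.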
Pages: 48-55
Number of pages: 8