Adding open spectral data to MassBank and PubChem using open source tools to support non-targeted exposomics of mixtures

Cited by: 8
Authors
Elapavalore, Anjana [1]
Kondic, Todor [1]
Singh, Randolph R. [1,2]
Shoemaker, Benjamin A. [3]
Thiessen, Paul A. [3]
Zhang, Jian [3]
Bolton, Evan E. [3]
Schymanski, Emma L. [1]
Affiliations
[1] Univ Luxembourg, Luxembourg Ctr Syst Biomed LCSB, 6 Ave Swing, L-4367 Belvaux, Luxembourg
[2] IFREMER Inst Francais Rech Exploitat Mer, Lab Biogeochim Contaminants Organ, Rue Ile Yeu,BP 21105, F-44311 Nantes 3, France
[3] NIH, Natl Ctr Biotechnol Informat NCBI, Natl Lib Med NLM, Bethesda, MD 20894 USA
Funding
US National Institutes of Health (NIH)
Keywords
SPECTROMETRY; CHALLENGE; CHEMICALS; MS/MS;
DOI
10.1039/d3em00181d
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
The term "exposome" is defined as a comprehensive study of life-course environmental exposures and the associated biological responses. Humans are exposed to many different chemicals, which can pose a major threat to the well-being of humanity. Targeted or non-targeted mass spectrometry techniques are widely used to identify and characterize various environmental stressors when linking exposures to human health. However, identification remains challenging due to the huge chemical space applicable to exposomics, combined with the lack of sufficient relevant entries in spectral libraries. Addressing these challenges requires cheminformatics tools and database resources to share curated open spectral data on chemicals to improve the identification of chemicals in exposomics studies. This article describes efforts to contribute spectra relevant for exposomics to the open mass spectral library MassBank (https://www.massbank.eu) using various open source software efforts, including the R packages RMassBank and Shinyscreen. The experimental spectra were obtained from ten mixtures containing toxicologically relevant chemicals from the US Environmental Protection Agency (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). Following processing and curation, 5582 spectra from 783 of the 1268 ENTACT compounds were added to MassBank, and through this to other open spectral libraries (e.g., MoNA, GNPS) for community benefit. Additionally, an automated deposition and annotation workflow was developed with PubChem to enable the display of all MassBank mass spectra in PubChem, which is rerun with each MassBank release. The new spectral records have already been used in several studies to increase the confidence in identification in non-target small molecule identification workflows applied to environmental and exposomics research.
Pages: 1788-1801
Number of pages: 14