Data flow modeling, data mining and QSAR in high-throughput discovery of functional nanomaterials

Cited by: 25
Authors
Yang, Yang [1]
Lin, Tian [1]
Weng, Xiao L. [2]
Darr, Jawwad A. [2]
Wang, Xue Z. [1]
Affiliations
[1] Univ Leeds, Inst Particle Sci & Engn, Sch Proc Environm & Mat Engn, Leeds LS2 9JT, W Yorkshire, England
[2] UCL, Dept Chem, London WC1H 0AJ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Data mining; QSAR; Design of experiments; Genetic algorithm; Nanoparticle; High-throughput; PROCESS OPERATIONAL DATA; CONTINUOUS HYDROTHERMAL SYNTHESIS; CEO2-ZRO2 MIXED OXIDES; SOLID-SOLUTIONS; DECISION TREES; ECOTOXICITY DATA; CERIA; NANOPARTICLES; CATALYSTS; COMBINATORIAL;
DOI
10.1016/j.compchemeng.2010.04.018
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Metal oxide nanoparticles are promising materials for fuel cells, gas sensors and fine chemical catalysis. Their functionality depends strongly on composition and structure, as well as on synthesis and processing conditions. Continuous hydrothermal flow synthesis (CHFS) reactors are an effective technology for making nanoceramics. To increase the sample throughput of CHFS, a manual high-throughput continuous hydrothermal (HiTCH) flow synthesis process capable of formulating scores of samples per day was developed. More recently, a fully automated nanoceramics synthesis platform called RAMSI (rapid automated synthesis instrument), based on the HiTCH synthesis technology, was developed. When large numbers of nanoceramics are made and formulated into appropriate libraries, automated analytical instruments can be used to collect a large amount of useful data. This paper describes the information flow management system of RAMSI (as well as CHFS) and the data mining system for supporting discovery, QSAR (quantitative structure-activity relationship) modeling and DoE (design of experiments). Case studies demonstrating the use of the high-throughput data mining system are presented. These include clustering of Raman spectra, interpretation of X-ray diffraction (XRD) measurements, and QSAR model building linking XRD data and photocatalytic properties. A genetic algorithm method for DoE is also presented that can guide the experiments in the search for optimal XRD patterns. © 2010 Elsevier Ltd. All rights reserved.
Pages: 671-678
Number of pages: 8