Computational method for discovery of biomarker signatures from large, complex data sets

Cited by: 2

Authors:

Makarov, Vladimir [1, 2]

Gorlin, Alex [2]

Affiliations:

[1] Calif State Univ Channel Isl, Camarillo, CA 93012 USA

[2] IFXworks LLC, 2915 Columbia Pike, Arlington, VA 22204 USA

Source:

COMPUTATIONAL BIOLOGY AND CHEMISTRY | 2018 / Volume 76

Keywords:

Biomarker; Microarray; Gene expression; Chemical; Classification; TRANSLATIONAL BIOINFORMATICS; SELECTION; CLASSIFICATION;

DOI:

10.1016/j.compbiolchem.2018.07.008

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

We present an efficient method for identifying reliable biomarker panels in large multivariate data sets, such as those typically produced by experiments that monitor changes in RNA, small-molecule, or protein abundance. Our computational methodology was developed and validated on the toxicogenomics database Drug Matrix, which in its largest category contains 1656 recognition targets, each characterized by the toxicant, dose, and time (or duration) of exposure. We were able to recognize both individual experimental conditions (compound, dose, and time combinations) and cases where the dose and time values fall within the intervals covered by the training data but do not match the training data exactly. Including gene expression information for multiple organs improved recognition accuracy. Taking time-response information into account allowed us to develop particularly accurate marker panels for a large number of targets: we were able to recognize 176 compounds (out of 316) at greater than 90% accuracy. The presented methodology has an immediate application in the discovery of diagnostic biomarker panels for exposure to various toxicity hazards, and may also be useful for developing biological markers for medical applications.

Pages: 161-168

Number of pages: 8
