A hybrid and exploratory approach to knowledge discovery in metabolomic data

Cited by: 9
Authors
Grissa, Dhouha [1 ,4 ]
Comte, Blandine [1 ]
Petera, Melanie [2 ]
Pujos-Guillot, Estelle [1 ]
Napoli, Amedeo [3 ]
Affiliations
[1] Univ Clermont Auvergne, INRA, UNH, Mapping, F-63000 Clermont Ferrand, France
[2] Univ Clermont Auvergne, INRA, UNH, Plateforme Explorat Metab,MetaboHUB Clermont, F-63000 Clermont Ferrand, France
[3] Univ Lorraine, CNRS, INRIA, LORIA, F-54000 Nancy, France
[4] Univ Copenhagen, Novo Nordisk Fdn, Ctr Prot Res, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
Keywords
Hybrid knowledge discovery; Pattern mining; Formal concept analysis; Data and pattern exploration; Metabolomic data; Classification; Visualization; Interpretation; Feature selection
DOI
10.1016/j.dam.2018.11.025
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this paper, we propose a hybrid and exploratory knowledge discovery approach for analyzing complex metabolomic data, based on a combination of supervised classifiers, pattern mining, and Formal Concept Analysis (FCA). The approach is based on three main operations: preprocessing, classification, and postprocessing. Classifiers are applied to datasets of the form individuals × features and produce sets of ranked features, which are then analyzed further. Pattern mining and FCA are used to provide a complementary analysis and support for visualization. A practical application of this framework is presented in the context of metabolomic data, where two interrelated problems are considered: discrimination and prediction of class membership. The dataset is characterized by a small set of individuals and a large set of features, in which predictive biomarkers of clinical outcomes should be identified. The problems of combining numerical and symbolic data mining methods, as well as discrimination and prediction, are detailed and discussed. Moreover, it appears that visualization based on FCA can be used both for guiding knowledge discovery and for interpretation by domain analysts. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 103-116
Number of pages: 14
Related papers
50 in total
  • [31] A unified multilingual and multimedia data mining approach for cancer knowledge discovery
    Lee, Chung-Hong
    Wu, Chih-Hong
    Chung, Hsiang-Hang
    Yang, Hsin-Chang
    2007 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, VOL II, PROCEEDINGS, 2007, : 241 - +
  • [32] Prov-Dominoes: An approach for knowledge discovery from provenance data
    Alencar, Victor
    Kohwalter, Troy
    Braganholo, Vanessa
    Da Silva Junior, Jose Ricardo
    Murta, Leonardo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 245
  • [33] K-optimal pattern discovery: An efficient and effective approach to exploratory data mining
    Webb, GI
    AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 1 - 2
  • [34] A systems approach to knowledge discovery
    Reed, KL
    MINING AND MODELING MASSIVE DATA SETS IN SCIENCE, ENGINEERING, AND BUSINESS WITH A SUBTHEME IN ENVIRONMENTAL STATISTICS, 1997, 29 (01): 105 - 105
  • [35] Knowledge discovery in scientific data
    Rudolph, S
    DATA MINING AND KNOWLEDGE DISCOVERY: THEORY, TOOLS, AND TECHNOLOGY II, 2000, 4057 : 250 - 258
  • [36] Knowledge Discovery in Data Science
    Grady, Nancy W.
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 1603 - 1608
  • [37] Data Science and Knowledge Discovery
    Portela, Filipe
    FUTURE INTERNET, 2021, 13 (07)
  • [38] Knowledge discovery from data?
    Pazzani, MJ
    IEEE INTELLIGENT SYSTEMS & THEIR APPLICATIONS, 2000, 15 (02): 10 - 13
  • [39] Knowledge discovery in data warehouses
    Palpanas, T
    SIGMOD RECORD, 2000, 29 (03) : 88 - 100
  • [40] Knowledge discovery in data warehouses
    Palpanas, Themistoklis
    SIGMOD Record (ACM Special Interest Group on Management of Data), 2000, 29 (03): 88 - 100