Data preparation for KDD through automatic reasoning based on description logic

Cited by: 11
Authors
Lara, Juan A. [1]
Lizcano, David [1]
Aurora Martinez, Ma [1]
Pazos, Juan [2]
Affiliations
[1] Univ Distancia Madrid, Fac Ensenanzas Tecn, Madrid 28400, Spain
[2] Univ Politecn Madrid, Sch Comp Sci, E-28660 Madrid, Spain
Keywords
KDD; Data preparation; Data mining; Description logic; Automatic reasoning; PROTOTYPE REDUCTION SCHEMES; TIME-SERIES;
DOI
10.1016/j.is.2014.03.002
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Without data preparation, data mining algorithms cannot operate on data within the knowledge discovery in databases (KDD) process. In fact, the success of later KDD phases largely depends on the data preparation stage. The use of mechanisms for automatically preparing data saves a lot of time and resources within the KDD process. These resources will then be available for use at later, less automatable stages, for example, during results interpretation. We have proposed a general-purpose mechanism applicable to multiple domains in order to improve the data preparation phase in the KDD process. This mechanism processes and automatically converts input data to a suitable format for the application of different data preparation techniques based on a known syntax. It is based on the use of description logic. Taking a generic UML2 data model as a reference, this mechanism is able to check whether any XML data source whatsoever can be transformed and modelled as a subsumption or instance of the above UML2 model. Thus it automatically identifies a consistent, non-ambiguous and finite set of XSLT transformations which are used to prepare the data for the application of data mining techniques, obviating the need to expend resources on the preliminary preparation and formatting stage. The proposed mechanism was applied to structurally complex data from four different domains. In order to test the validity of the proposal, we have applied data mining techniques to extract knowledge from the prepared data. The sound results of applying our proposal to several different domains confirm that it is applicable to any XML data source, as well as being correct, computationally efficient and saving time during the data preparation phase. (C) 2014 Elsevier Ltd. All rights reserved.
Pages: 54-72
Page count: 19
Related papers
50 in total
  • [31] Bounded model checking with description logic reasoning
    Ben-David, Shoham
    Trefler, Richard
    Weddell, Grant
    AUTOMATED REASONING WITH ANALYTIC TABLEAUX AND RELATED METHODS, PROCEEDINGS, 2007, 4548 : 60 - +
  • [32] A diagrammatic reasoning system for the description logic ALC
    Dau, Frithjof
    Eklund, Peter
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2008, 19 (05): : 539 - 573
  • [33] DESCRIPTION AND REASONING OF VLSI CIRCUIT IN TEMPORAL LOGIC
    FUSAOKA, A
    SEKI, H
    TAKAHASHI, K
    NEW GENERATION COMPUTING, 1984, 2 (01) : 79 - 90
  • [34] Reasoning in Description Logic Ontologies for Privacy Management
    Nuradiansyah, Adrian
    KUNSTLICHE INTELLIGENZ, 2020, 34 (03): : 411 - 415
  • [35] A Parameterized Complexity View on Description Logic Reasoning
    de Haan, Ronald
    SIXTEENTH INTERNATIONAL CONFERENCE ON PRINCIPLES OF KNOWLEDGE REPRESENTATION AND REASONING, 2018, : 359 - 368
  • [36] Automatic image description based on textual data
    Badr, Youakim
    Chbeir, Richard
    JOURNAL ON DATA SEMANTICS VII, 2006, 4244 : 196 - 218
  • [37] A LOGIC FOR DATA DESCRIPTION
    ARHANGELSKY, DA
    TAITSLIN, MA
    LECTURE NOTES IN COMPUTER SCIENCE, 1989, 363 : 2 - 11
  • [38] Automated end user-centred adaptation of web components through automated description logic-based reasoning
    Lizcano, David
    Alonso, Fernando
    Soriano, Javier
    Lopez, Genoveva
    INFORMATION AND SOFTWARE TECHNOLOGY, 2015, 57 : 446 - 462
  • [39] A Tableau Algorithm for Paraconsistent and Nonmonotonic Reasoning in Description Logic-Based System
    Zhang, Xiaowang
    Lin, Zuoquan
    Wang, Kewen
    WEB TECHNOLOGIES AND APPLICATIONS, 2011, 6612 : 345 - +
  • [40] Error-Tolerant Reasoning in the Description Logic EL Based on Optimal Repairs
    Baader, Franz
    Kriegel, Francesco
    Nuradiansyah, Adrian
    RULES AND REASONING, RULEML+RR 2022, 2022, 13752 : 227 - 243