Data preparation for KDD through automatic reasoning based on description logic

Cited by: 11
Authors
Lara, Juan A. [1]
Lizcano, David [1]
Aurora Martinez, Ma [1]
Pazos, Juan [2]
Affiliations
[1] Univ Distancia Madrid, Fac Ensenanzas Tecn, Madrid 28400, Spain
[2] Univ Politecn Madrid, Sch Comp Sci, E-28660 Madrid, Spain
Keywords
KDD; Data preparation; Data mining; Description logic; Automatic reasoning; PROTOTYPE REDUCTION SCHEMES; TIME-SERIES;
DOI
10.1016/j.is.2014.03.002
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Without data preparation, data mining algorithms cannot operate on data within the knowledge discovery in databases (KDD) process. In fact, the success of later KDD phases largely depends on the data preparation stage. The use of mechanisms for automatically preparing data saves a lot of time and resources within the KDD process. These resources will then be available for use at later, less automatable stages, for example, during results interpretation. We have proposed a general-purpose mechanism applicable to multiple domains in order to improve the data preparation phase in the KDD process. This mechanism processes and automatically converts input data to a suitable format for the application of different data preparation techniques based on a known syntax. It is based on the use of description logic. Taking a generic UML2 data model as a reference, this mechanism is able to check whether any XML data source whatsoever can be transformed and modelled as a subsumption or instance of the above UML2 model. Thus it automatically identifies a consistent, non-ambiguous and finite set of XSLT transformations which are used to prepare the data for the application of data mining techniques, obviating the need to expend resources on the preliminary preparation and formatting stage. The proposed mechanism was applied to structurally complex data from four different domains. In order to test the validity of the proposal, we have applied data mining techniques to extract knowledge from the prepared data. The sound results of applying our proposal to several different domains confirm that it is applicable to any XML data source, as well as being correct, computationally efficient and saving time during the data preparation phase. (C) 2014 Elsevier Ltd. All rights reserved.
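The check-then-transform idea in the abstract can be illustrated with a very rough sketch. It is not the authors' method: it uses neither description logic reasoning nor XSLT, and it substitutes a plain structural check in Python's standard library. The names `REFERENCE_MODEL`, `conforms`, `to_rows`, and the sample XML are all hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference model (a stand-in for the generic UML2 model):
# element name -> set of child element names that must be present.
REFERENCE_MODEL = {
    "dataset": {"record"},
    "record": {"id", "value"},
}


def conforms(element, model):
    """Return True if the element and all its descendants contain the
    child elements that the model requires for their tag."""
    required = model.get(element.tag, set())
    present = {child.tag for child in element}
    if not required <= present:
        return False
    return all(conforms(child, model) for child in element)


def to_rows(root):
    """Flatten every <record> into a dict of its child elements' text.
    This plays the role of the transformation step (XSLT in the paper)."""
    return [
        {field.tag: (field.text or "").strip() for field in record}
        for record in root.iter("record")
    ]


# Hypothetical input data source.
SAMPLE = """<dataset>
  <record><id>1</id><value>3.5</value></record>
  <record><id>2</id><value>4.0</value></record>
</dataset>"""

root = ET.fromstring(SAMPLE)
# Only transform the source if it fits the reference model.
rows = to_rows(root) if conforms(root, REFERENCE_MODEL) else []
print(rows)
```

A source that lacks a required element (for example, a `record` without `value`) fails the check and is not transformed, which mirrors, in a much weaker form, the idea of only preparing data that can be modelled as an instance of the reference model.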
Pages: 54-72
Page count: 19