Ontology-based conceptual design of ETL processes for both structured and semi-structured data

Cited by: 62
Authors
Skoutas, Dimitris [1]
Simitsis, Alkis [1]
Affiliations
[1] Natl Tech Univ Athens, Dept Elect & Comp Engn, GR-10682 Athens, Greece
Keywords
conceptual design; data semantics; data warehousing; ETL; ontology; semantic matching; workflow diagram
DOI
10.4018/jswis.2007100101
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the main tasks in the early stages of a data warehouse project is the identification of the appropriate transformations and the specification of inter-schema mappings from the data sources to the data warehouse. In this article, we propose an ontology-based approach to facilitate the conceptual design of the backstage of a data warehouse. A graph-based representation is used as a conceptual model for the datastores, so that both structured and semi-structured data are supported and handled in a uniform way. The proposed approach uses Semantic Web technologies to semantically annotate the data sources and the data warehouse, so that mappings between them can be inferred, thereby resolving the issue of heterogeneity. Specifically, a suitable application ontology is created and used to annotate the datastores. The language used for describing the ontology is OWL-DL. Based on the provided annotations, a DL reasoner is employed to infer semantic correspondences and conflicts among the datastores, and to propose a set of conceptual operations for transforming data from the source datastores to the data warehouse.
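The pipeline outlined in the abstract (annotate datastore nodes with ontology classes, infer correspondences, propose operations) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class hierarchy, the node names, and the operation labels `LOAD`, `FILTER`, and `CONVERT` are all hypothetical, and a plain subclass lookup stands in for the OWL-DL reasoner the article relies on.

```python
# Toy stand-in for an application ontology: child class -> parent class.
# All class names here are hypothetical, chosen only for illustration.
SUBCLASS_OF = {
    "EuropeanCustomer": "Customer",
    "Customer": "Thing",
    "PriceInEuros": "Price",
    "PriceInDollars": "Price",
    "Price": "Thing",
}


def ancestors(cls):
    """Return cls followed by all of its superclasses, nearest first."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def nearest_common_ancestor(a, b):
    """Return the most specific class that subsumes both a and b."""
    b_ancestors = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in b_ancestors:
            return candidate
    return None


def propose_operation(source_cls, target_cls):
    """Pick a hypothetical conceptual operation from the class relationship."""
    if target_cls in ancestors(source_cls):
        # Source is the same as, or more specific than, the target.
        return "LOAD"
    if source_cls in ancestors(target_cls):
        # Target is more specific: only part of the source data qualifies.
        return "FILTER"
    if nearest_common_ancestor(source_cls, target_cls) not in (None, "Thing"):
        # Sibling classes: same concept, different representation.
        return "CONVERT"
    return None  # no semantic correspondence found


def infer_mappings(source, target):
    """Compare every annotated source node with every annotated target node."""
    mappings = []
    for s_node, s_cls in source.items():
        for t_node, t_cls in target.items():
            op = propose_operation(s_cls, t_cls)
            if op is not None:
                mappings.append((s_node, op, t_node))
    return mappings


# Hypothetical annotations: datastore node -> ontology class.
source_annotations = {
    "orders.xml/order/amount": "PriceInDollars",  # semi-structured source
    "crm.clients": "Customer",                    # relational source
}
target_annotations = {
    "dw.sales.amount_eur": "PriceInEuros",
    "dw.eu_customers": "EuropeanCustomer",
}

for mapping in infer_mappings(source_annotations, target_annotations):
    print(mapping)
```

In this sketch the semi-structured and relational sources are treated alike because both are reduced to annotated nodes, which mirrors the uniform handling the abstract describes. A real DL reasoner would additionally handle property restrictions, disjointness, and conflict detection, none of which is modelled here.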
Pages: 1-24
Number of pages: 24