A Lightweight Approach to Extract Interschema Properties from Structured, Semi-Structured and Unstructured Sources in a Big Data Scenario

Cited by: 5
Authors
Cauteruccio, Francesco [1]
Lo Giudice, Paolo [2]
Musarella, Lorenzo [2]
Terracina, Giorgio [1]
Ursino, Domenico [3]
Virgili, Luca [3]
Affiliations
[1] Univ Calabria, Dipartimento Matemat & Informat, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Mediterranea Reggio Calabria, Dipartimento Ingn Informaz Infrastrutture & Energ, Via Univ,25 Gia Salita Melissari, I-89124 Reggio Di Calabria, CF, Italy
[3] Univ Politecn Marche, Dipartimento Ingn Informaz, Via Brecce Bianche 12, I-60131 Ancona, Italy
Keywords
Unstructured sources; interschema property derivation; structuring unstructured data; big data; METADATA QUALITY; DIGITAL REPOSITORIES; SIMILARITY; CLASSIFICATION; CONSTRUCTION; INTEGRATION; SYSTEM; MODEL; DIKE;
DOI
10.1142/S0219622020500182
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Knowledge of interschema properties (e.g., synonymies, homonymies, hyponymies and subschema similarities) plays a key role in enabling decision-making over sources characterized by disparate formats. In the past, a wide variety of approaches for deriving interschema properties from structured and semi-structured data have been proposed. However, it is currently estimated that more than 80% of data sources are unstructured. Furthermore, the number of sources generally involved in an interaction is much higher than in the past. As a consequence, new approaches are needed to address the interschema property derivation issue in this new scenario. In this paper, we aim to contribute to this setting by proposing an approach capable of uniformly extracting interschema properties from a huge number of structured, semi-structured and unstructured sources.
Pages: 849-889
Number of pages: 41
Related papers
50 in total
  • [21] Semi-Structured Data Model for Big Data (SS-DMBD)
    Hamouda, Shady
    Zainol, Zurinahni
    PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE, TECHNOLOGY AND APPLICATIONS (DATA), 2019, : 348 - 356
  • [22] Extracting information from semi-structured Internet sources
    Jeong, JS
    Oh, DI
    ISIE 2001: IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS PROCEEDINGS, VOLS I-III, 2001, : 1378 - 1381
  • [23] An ontology-based approach to designing a NoSQL database for semi-structured and unstructured health data
    Poly Sil Sen
    Nandini Mukherjee
    Cluster Computing, 2024, 27 : 959 - 976
  • [24] An ontology-based approach to designing a NoSQL database for semi-structured and unstructured health data
    Sen, Poly Sil
    Mukherjee, Nandini
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (01): : 959 - 976
  • [25] Extracting information from semi-structured internet sources
    Div. of Info. Tech. Eng., College of Engineering, SoonChunHyang University, Asan, Korea, Republic of
    IEEE Int Symp Ind Electron, (1378-1381):
  • [26] A Semantic Layer on Semi-structured Data Sources for Intuitive Chatbots
    Augello, Agnese
    Vassallo, Giorgio
    Gaglio, Salvatore
    Pilato, Giovanni
    CISIS: 2009 INTERNATIONAL CONFERENCE ON COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS, VOLS 1 AND 2, 2009, : 760 - +
  • [27] Semi-automatic Knowledge Extraction from Semi-structured and Unstructured Data Within the OMAHA Project
    Reuss, Pascal
    Althoff, Klaus-Dieter
    Henkel, Wolfram
    Pfeiffer, Matthias
    Hankel, Oliver
    Pick, Roland
    CASE-BASED REASONING RESEARCH AND DEVELOPMENT, ICCBR 2015, 2015, 9343 : 336 - 350
  • [28] Data Warehouse Based Approach to the Integration of Semi-structured Data
    Ahmad, Houda
    Kermanshahani, Shokoh
    Simonet, Ana
    Simonet, Michel
    ADVANCES IN WEB AND NETWORK TECHNOLOGIES, AND INFORMATION MANAGEMENT, 2009, 5731 : 88 - 99
  • [29] Similarity-Based Classification for Big Non-Structured and Semi-Structured Recipe Data
    Chen, Wei
    Zhao, Xiangyu
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2016, 2016, 9645 : 57 - 64
  • [30] Multilingual Food and Health Ontology Learning Using Semi-Structured and Structured Web Data Sources
    Albukhitan, Saeed
    Helmy, Tarek
    2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS (WI-IAT WORKSHOPS 2012), VOL 3, 2012, : 231 - 235