RECONCILING CONTINUOUS ATTRIBUTE VALUES FROM MULTIPLE DATA SOURCES

Citations: 0
Authors
Jiang Zhengrui [1]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
Keywords
Data integration; heterogeneous databases; data heterogeneity; data quality; type I, type II, and misrepresentation errors
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Because of the heterogeneous nature of different data sources, data integration is often one of the most challenging tasks in managing modern information systems. The challenges exist at three different levels: schema heterogeneity, entity heterogeneity, and data heterogeneity. The existing literature has largely focused on schema heterogeneity and entity heterogeneity, and the very limited work on data heterogeneity either avoids attribute value conflicts or resolves them in an ad hoc manner. The focus of this research is on data heterogeneity. We propose a decision-theoretical framework that enables attribute value conflicts to be resolved in a cost-efficient manner. The framework takes into consideration the consequences of incorrect data values and selects the value that minimizes the total expected error costs for all application problems. Numerical results show that significant savings can be achieved by adopting the proposed framework instead of simply selecting the most likely value or relying on ad hoc approaches.
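The selection rule described in the abstract (choose the value that minimizes total expected error cost across applications) can be illustrated with a minimal Python sketch. This is not the paper's actual model: the candidate values, their probabilities, the two cost functions, and the names `total_expected_cost` and `reconcile` are all hypothetical and chosen only to show how a cost-minimizing choice can differ from the most likely value.

```python
from typing import Callable, Sequence

# A cost function maps (chosen value, true value) to an error cost.
CostFn = Callable[[float, float], float]


def total_expected_cost(chosen: float,
                        candidates: Sequence[float],
                        probs: Sequence[float],
                        costs: Sequence[CostFn]) -> float:
    """Expected error cost of storing `chosen`, summed over all applications.

    Assumes (hypothetically) that the true value is one of `candidates`,
    with `probs[i]` the probability that `candidates[i]` is correct.
    """
    return sum(
        p * cost(chosen, true)
        for cost in costs
        for true, p in zip(candidates, probs)
    )


def reconcile(candidates: Sequence[float],
              probs: Sequence[float],
              costs: Sequence[CostFn]) -> float:
    """Return the candidate with the smallest total expected error cost."""
    return min(
        candidates,
        key=lambda v: total_expected_cost(v, candidates, probs, costs),
    )


def asymmetric_cost(chosen: float, true: float) -> float:
    # Hypothetical application 1: understating is five times as costly
    # as overstating.
    diff = chosen - true
    return diff if diff >= 0 else -5.0 * diff


def absolute_cost(chosen: float, true: float) -> float:
    # Hypothetical application 2: cost is the absolute error.
    return abs(chosen - true)


# Three sources report conflicting values for the same attribute.
candidates = [100.0, 120.0, 150.0]
probs = [0.5, 0.3, 0.2]  # illustrative probabilities, not from the paper
costs = [asymmetric_cost, absolute_cost]

most_likely = candidates[probs.index(max(probs))]  # 100.0
best = reconcile(candidates, probs, costs)         # 120.0
```

With these illustrative numbers, the most likely value (100.0) carries a higher total expected cost than 120.0, so the cost-minimizing rule selects a different value than the maximum-probability rule.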
Pages: 1548 - 1555
Number of pages: 8
Related papers
50 in total
  • [31] Attribute-based evaluation of multiple continuous queries for filtering incoming tuples of a data stream
    Lee, Hyun-Ho
    Yun, Eun-Won
    Lee, Won-Suk
    INFORMATION SCIENCES, 2008, 178 (11) : 2416 - 2432
  • [32] CLUSTERING CATEGORICAL DATA BASED ON COMBINATIONS OF ATTRIBUTE VALUES
    Do, Hee-Jung
    Kim, Jae Yearn
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2009, 5 (12A) : 4393 - 4405
  • [33] Granulating data on non-scalar attribute values
    Mazlack, L
    Coppock, S
    PROCEEDINGS OF THE 2002 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOL 1 & 2, 2002, : 944 - 949
  • [34] Rough set strategies to data with missing attribute values
    Grzymala-Busse, JW
    FOUNDATIONS AND NOVEL APPROACHES IN DATA MINING, 2006, 9 : 197 - 212
  • [35] Selection of Semantical Mapping of Attribute Values for Data Integration
    Szymczak, Marcin
    Bronselaer, Antoon
    Zadrozny, Slawomir
    De Tre, Guy
    INTELLIGENT SYSTEMS'2014, VOL 1: MATHEMATICAL FOUNDATIONS, THEORY, ANALYSES, 2015, 322 : 581 - 592
  • [36] A rough set approach to data with missing attribute values
    Grzymala-Busse, Jerzy W.
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2006, 4062 : 58 - 67
  • [37] Reconciling tuple and attribute timestamping for temporal data warehouses
    Waqas Ahmed
    Leticia Gómez
    Alejandro Vaisman
    Esteban Zimányi
    The VLDB Journal, 2025, 34 (1)
  • [38] IoT streaming data integration from multiple sources
    Doan Quang Tu
    A. S. M. Kayes
    Wenny Rahayu
    Kinh Nguyen
    Computing, 2020, 102 : 2299 - 2329
  • [39] FITTING SMOOTHING SPLINES TO DATA FROM MULTIPLE SOURCES
    GAO, F
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1994, 23 (06) : 1665 - 1698
  • [40] Combining data from multiple sources: A cautionary tale
    Robinson, Chris A.
    Terhune, Claire E.
    AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY, 2016, 159 : 270 - 270