RECONCILING CONTINUOUS ATTRIBUTE VALUES FROM MULTIPLE DATA SOURCES

Cited by: 0
Authors
Jiang Zhengrui [1]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
Keywords
Data integration; heterogeneous databases; data heterogeneity; data quality; type I, type II, and misrepresentation errors
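DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract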
Because data sources are heterogeneous, data integration is often one of the most challenging tasks in managing modern information systems. The challenges exist at three levels: schema heterogeneity, entity heterogeneity, and data heterogeneity. The existing literature has largely focused on schema heterogeneity and entity heterogeneity, and the very limited work on data heterogeneity either avoids attribute value conflicts or resolves them in an ad hoc manner. The focus of this research is on data heterogeneity. We propose a decision-theoretic framework that enables attribute value conflicts to be resolved in a cost-efficient manner. The framework takes into consideration the consequences of incorrect data values and selects the value that minimizes the total expected error cost across all application problems. Numerical results show that significant savings can be achieved by adopting the proposed framework instead of simply selecting the most likely value or resolving conflicts ad hoc.
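The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's actual model: the candidate values, their probabilities, and the asymmetric cost function `asymmetric_cost` are all hypothetical, and the search is restricted to the values reported by the sources. The sketch only shows how the value with the lowest expected error cost can differ from the most likely value.

```python
from typing import Callable, Sequence


def expected_cost(
    chosen: float,
    candidates: Sequence[float],
    probs: Sequence[float],
    cost: Callable[[float, float], float],
) -> float:
    """Expected error cost of storing `chosen` when the true value is uncertain."""
    return sum(p * cost(chosen, true) for true, p in zip(candidates, probs))


def reconcile(
    candidates: Sequence[float],
    probs: Sequence[float],
    cost: Callable[[float, float], float],
) -> float:
    """Pick the candidate with the lowest expected error cost."""
    return min(candidates, key=lambda v: expected_cost(v, candidates, probs, cost))


def asymmetric_cost(chosen: float, true: float) -> float:
    """Hypothetical cost: overstating costs 3 per unit, understating 1 per unit."""
    diff = chosen - true
    return 3.0 * diff if diff > 0 else -1.0 * diff


# Hypothetical conflicting values from three sources, with assumed probabilities
# that each one is the true value.
candidates = [40.0, 50.0, 60.0]
probs = [0.30, 0.45, 0.25]

most_likely = candidates[probs.index(max(probs))]
best = reconcile(candidates, probs, asymmetric_cost)

print("most likely value:", most_likely)
print("minimum expected cost value:", best)
```

With these assumed numbers the most likely value is 50, but because overstatement is penalized more heavily, storing 40 has the lower expected cost (9.5 versus 11.5).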
Pages: 1548-1555
Number of pages: 8