RECONCILING CONTINUOUS ATTRIBUTE VALUES FROM MULTIPLE DATA SOURCES

Cited by: 0
Authors
Jiang Zhengrui [1]
Affiliations
[1] Iowa State Univ, Ames, IA 50011 USA
Keywords
Data integration; heterogeneous databases; data heterogeneity; data quality; type I, type II, and misrepresentation errors
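DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract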
Because of the heterogeneous nature of different data sources, data integration is often one of the most challenging tasks in managing modern information systems. The challenges exist at three levels: schema heterogeneity, entity heterogeneity, and data heterogeneity. The existing literature has largely focused on schema heterogeneity and entity heterogeneity, and the very limited work on data heterogeneity either avoids attribute value conflicts or resolves them in an ad hoc manner. This research focuses on data heterogeneity. We propose a decision-theoretical framework that enables attribute value conflicts to be resolved in a cost-efficient manner. The framework takes into consideration the consequences of incorrect data values and selects the value that minimizes the total expected error cost across all application problems. Numerical results show that significant savings can be achieved by adopting the proposed framework instead of simply selecting the most likely value or relying on ad hoc approaches.
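The selection rule summarized in the abstract (pick the value with the lowest total expected error cost rather than the most likely value) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the candidate values and their probabilities, the threshold-style cost functions, and the names `threshold_cost`, `expected_cost`, `most_likely`, and `min_cost_value`.

```python
from typing import Callable, Dict, List

# Hypothetical candidate values reported by different sources, each with an
# assumed probability of being the true value (illustrative numbers only).
candidates: Dict[float, float] = {50_000.0: 0.5, 60_000.0: 0.3, 90_000.0: 0.2}

# An application's error cost: cost(stored_value, true_value) -> float.
CostFn = Callable[[float, float], float]


def threshold_cost(threshold: float, under_cost: float, over_cost: float) -> CostFn:
    """Hypothetical cost model: an application only checks value >= threshold.

    under_cost is paid when the true value passes the threshold but the stored
    value does not; over_cost is paid in the opposite case.
    """
    def cost(stored: float, true: float) -> float:
        if true >= threshold and stored < threshold:
            return under_cost
        if true < threshold and stored >= threshold:
            return over_cost
        return 0.0
    return cost


# Two hypothetical applications that consume the reconciled value.
applications: List[CostFn] = [
    threshold_cost(80_000.0, under_cost=100.0, over_cost=10.0),
    threshold_cost(55_000.0, under_cost=5.0, over_cost=5.0),
]


def expected_cost(stored: float, cands: Dict[float, float], apps: List[CostFn]) -> float:
    """Total expected error cost of storing `stored`, summed over all applications."""
    return sum(p * app(stored, true) for true, p in cands.items() for app in apps)


def most_likely(cands: Dict[float, float]) -> float:
    """Baseline: pick the candidate with the highest probability."""
    return max(cands, key=lambda v: cands[v])


def min_cost_value(cands: Dict[float, float], apps: List[CostFn]) -> float:
    """Cost-aware choice: pick the candidate with the lowest total expected cost."""
    return min(cands, key=lambda v: expected_cost(v, cands, apps))


if __name__ == "__main__":
    for name, value in [("most likely", most_likely(candidates)),
                        ("min expected cost", min_cost_value(candidates, applications))]:
        print(name, value, expected_cost(value, candidates, applications))
```

With these made-up numbers the two rules disagree: the most likely value is 50,000, but storing 90,000 has a lower total expected cost because missing the 80,000 threshold is assumed to be expensive. The paper's actual cost model and error definitions may differ from this toy setup.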
Pages: 1548-1555
Page count: 8
Related papers
50 in total
  • [41] Boosting for transfer learning from multiple data sources
    Huang, Pipei
    Wang, Gang
    Qin, Shiyin
    PATTERN RECOGNITION LETTERS, 2012, 33 (05) : 568 - 579
  • [42] Accessible Routes Integrating Data from Multiple Sources
    Luaces, Miguel R.
    Fisteus, Jesus A.
    Sanchez-Fernandez, Luis
    Munoz-Organero, Mario
    Balado, Jesus
    Diaz-Vilarino, Lucia
    Lorenzo, Henrique
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2021, 10 (01)
  • [43] On the Design of Autonomous Agents From Multiple Data Sources
    Garrabe, Emiland
    Russo, Giovanni
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 698 - 703
  • [44] IoT streaming data integration from multiple sources
    Tu, Doan Quang
    Kayes, A. S. M.
    Rahayu, Wenny
    Nguyen, Kinh
    COMPUTING, 2020, 102 (10) : 2299 - 2329
  • [45] Predicting Student Performance from Multiple Data Sources
    Koprinska, Irena
    Stretton, Joshua
    Yacef, Kalina
    ARTIFICIAL INTELLIGENCE IN EDUCATION, AIED 2015, 2015, 9112 : 678 - 681
  • [46] Mining Credit Interest Rate Data from Multiple Data Sources
    Hryhorkiv, Vasyl
    Buiak, Lesia
    Verstiak, Andrii
    Hryhorkiv, Mariia
    Verstiak, Oksana
    Berdnuk, Andrii
    2019 9TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER INFORMATION TECHNOLOGIES (ACIT'2019), 2019, : 265 - 268
  • [47] Semantic Deep Web: Automatic Attribute Extraction from the Deep Web Data Sources
    An, Yoo Jung
    Geller, James
    Wu, Yi-Ta
    Chun, Soon Ae
    APPLIED COMPUTING 2007, VOL 1 AND 2, 2007, : 1667 - 1672
  • [48] ON THE UNKNOWN ATTRIBUTE VALUES IN LEARNING FROM EXAMPLES
    GRZYMALABUSSE, JW
    LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, 1991, 542 : 368 - 377
  • [49] Learning from examples with unspecified attribute values
    Goldman, SA
    Kwek, SS
    Scott, SD
    INFORMATION AND COMPUTATION, 2003, 180 (02) : 82 - 100
  • [50] Multisensor multiple-attribute data association
    Jing, J
    Jing, G
    Fei, LP
    Sheng, LF
    Kong, SZ
    ICR '96 - 1996 CIE INTERNATIONAL CONFERENCE OF RADAR, PROCEEDINGS, 1996, : 393 - 396