A FRAMEWORK FOR DATA CLEANING IN DATA WAREHOUSES

Cited by: 0
Author
Peng, Taoxin [1]
Affiliation
[1] Napier Univ, Sch Comp, Edinburgh EH10 5DT, Midlothian, Scotland
Keywords
Data Cleaning; Data Quality; Data Integration; Data Warehousing
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is a crucial part of meeting it. A range of methods and tools has been developed for this purpose. However, at least two questions remain to be answered: how can the efficiency of data cleaning be improved, and how can its degree of automation be increased? This paper addresses these two questions by presenting a novel framework for managing data cleaning in data warehouses. The framework focuses on the use of data quality dimensions and decouples the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.
Pages: 473-478
Number of pages: 6
Related papers
  • [1] A conceptual framework for clinical data warehouses
    Garcia, A
    Xéxeo, G
    Sampaio, R
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2001, : 910 - 910
  • [2] A framework for developing enterprise data warehouses
    Murtaza, AH
    INFORMATION SYSTEMS MANAGEMENT, 1998, 15 (04) : 21 - 26
  • [3] Spatial data warehouses: a methodological framework
    Boussaïd, O
    Aufaure, MA
    ADVANCES IN SPATIAL ANALYSIS AND DECISION MAKING, 2004, 1 : 275 - 282
  • [4] A Framework for Designing Autonomous Parallel Data Warehouses
    Benkrid, Soumia
    Bellatreche, Ladjel
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2019, PT II, 2020, 11945 : 97 - 104
  • [5] A framework for mining association rules in data warehouses
    Tjioe, HC
    Taniar, D
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 159 - 165
  • [6] A conceptual framework for building and testing of Data Warehouses
    Simoes, Dora
    SISTEMAS Y TECNOLOGIAS DE INFORMACION, 2010, : 23 - 28
  • [7] Framework for Generalization and Improvement of Relational Data Warehouses
    Velinov, Goran
    Gligoroski, Danilo
    Popovska, Margita Kon
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2008, 8 (03) : 32 - 40
  • [8] A framework for multidimensional design of data warehouses from ontologies
    Romero, Oscar
    Abello, Alberto
    DATA & KNOWLEDGE ENGINEERING, 2010, 69 (11) : 1138 - 1157
  • [9] A Framework for Emulating Database Operations in Cloud Data Warehouses
    Soliman, Mohamed A.
    Antova, Lyublena
    Sugiyama, Marc
    Duller, Michael
    Aleyasen, Amirhossein
    Mitra, Gourab
    Abdelhamid, Ehab
    Morcos, Mark
    Gage, Michele
    Korablev, Dmitri
    Waas, Florian M.
    SIGMOD'20: PROCEEDINGS OF THE 2020 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2020, : 1447 - 1461
  • [10] Cleaning Framework for BigData - AN INTERACTIVE APPROACH FOR DATA CLEANING
    Liu, Hong
    Kumar, Ashwin T. K.
    Thomas, Johnson P.
    Hou, Xiaofei
    PROCEEDINGS 2016 IEEE SECOND INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING SERVICE AND APPLICATIONS (BIGDATASERVICE 2016), 2016, : 174 - 181