Empirical studies to assess the understandability of data warehouse schemas using structural metrics

Cited by: 24
Authors
Serrano, Manuel Angel [1]
Calero, Coral [1]
Sahraoui, Houari A. [1, 2]
Piattini, Mario [1]
Affiliations
[1] Univ Castilla La Mancha, Dept Informat Technol & Syst, Alarcos Res Grp, E-13071 Ciudad Real, Spain
[2] Univ Montreal, Dept Informat & Rech Operat, Montreal, PQ H3C 3J7, Canada
Keywords
data warehouse; quality; metrics; empirical studies
DOI
10.1007/s11219-007-9030-7
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Data warehouses are powerful tools for making better and faster decisions in organizations where information is an asset of primary importance. Due to the complexity of data warehouses, metrics and procedures are required to continuously assure their quality. This article describes an empirical study and a replication aimed at investigating the use of structural metrics as indicators of the understandability and, by extension, the cognitive complexity of data warehouse schemas. More specifically, a four-step analysis is conducted: (1) check whether, individually and collectively, the considered metrics can be correlated with schema understandability using classical statistical techniques, (2) evaluate whether understandability can be predicted by case similarity using the case-based reasoning technique, (3) determine, for each level of understandability, the subsets of metrics that are important by means of a classification technique, and (4) assess, by means of a probabilistic technique, the degree of participation of each metric in the understandability prediction. The results obtained show that although a linear model is a good approximation of the relation between structure and understandability, the associated coefficients are not significant enough. Additionally, the remaining analyses reveal, respectively, that prediction can be achieved by considering structure similarity, that extracted classification rules can be used to estimate the magnitude of understandability, and that some metrics, such as the number of fact tables, have more impact than others.
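Step (2) of the abstract, prediction by case similarity, can be illustrated with a minimal nearest-neighbour sketch. It is not the authors' implementation. The case base, the metric pair (number of fact tables, number of dimension tables), the understanding times, and the function name `predict_understanding_time` are all hypothetical and chosen only for illustration.

```python
from math import dist

# Hypothetical case base (NOT data from the study):
# (number of fact tables, number of dimension tables) -> understanding time in seconds.
CASES = [
    ((1, 3), 60.0),
    ((2, 5), 95.0),
    ((3, 8), 150.0),
    ((4, 10), 210.0),
]


def predict_understanding_time(metrics, cases=CASES, k=1):
    """Average the outcome of the k stored cases closest to `metrics`.

    Similarity is plain Euclidean distance over the metric vector; this
    choice is an assumption made for this sketch.
    """
    nearest = sorted(cases, key=lambda case: dist(case[0], metrics))[:k]
    return sum(outcome for _, outcome in nearest) / len(nearest)


if __name__ == "__main__":
    # A new schema with 2 fact tables and 6 dimension tables.
    print(predict_understanding_time((2, 6)))
    print(predict_understanding_time((2, 6), k=2))
```

With real data the metrics would normally be rescaled before computing distances, so that no single metric dominates the similarity measure.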
Pages: 79-106
Page count: 28