A family of experiments to validate measures for UML activity diagrams of ETL processes in data warehouses
Date
11/01/2010
Author
Muñoz, Lilia
Mazón, Jose Norberto
Trujillo, Juan
Abstract
In data warehousing, Extract, Transform, and Load (ETL) processes are in charge of extracting the data from the data sources that will be contained in the data warehouse. Their design and maintenance are thus a cornerstone of any data warehouse development project. Due to their relevance, the quality of these processes should be formally assessed early in development in order to avoid populating the data warehouse with incorrect data. To this end, this paper presents a set of measures with which to evaluate the structural complexity of ETL process models at the conceptual level. This study is, moreover, accompanied by the application of formal frameworks and a family of experiments whose aims are to validate the proposed measures theoretically and empirically, respectively. Our experiments show that the use of these measures can help designers to predict the effort associated with the maintenance tasks of ETL processes and to make ETL process models more usable. Our work is based on Unified Modeling Language (UML) activity diagrams for modeling ETL processes, and on the Framework for the Modeling and Evaluation of Software Processes (FMESP) for the definition and validation of the measures.
Related items
Showing items related by title, author, creator and subject.
- ETL Process Modeling Conceptual for Data Warehouses: A Systematic Mapping Study
  Muñoz, Lilia; Mazón, Jose Norberto; Trujillo, Juan (06/16/2011)
  BACKGROUND: A data warehouse (DW) is an integrated collection of subject-oriented data in the support of decision making. Importantly, the integration of data sources is achieved through the use of ETL (Extract, Transform, ...
- Measures for ETL processes models in data warehouses
  Muñoz, Lilia; Mazón, Jose Norberto; Trujillo, Juan (11/06/2009)
  processes take charge of extracting the data from data sources that would be contained in the data warehouse. Due to their relevance, the quality of these processes should be formally assessed since the early stages of ...
- Systematic review and comparison of modeling ETL processes in data warehouse
  Muñoz, Lilia; Mazón, Jose Norberto; Trujillo, Juan (08/23/2010)
  Abstract: In a Data Warehouse (DW), ETL processes (Extraction, Transformation, Load) are responsible for extracting, transforming and loading data from the data sources into the DW. A good design of these processes in the ...