Quality Assurance in Business Intelligence / Data Warehouse
- February 23, 2015
OVERVIEW OF DATA WAREHOUSE
A database is one of the most important assets an organization can own, and the information it contains is usually of significant value. Today it is essential for an organization to collect and analyze current data for corporate decision making, reporting and data mining, as well as for examining historical trends. To use data for decision making, data is usually gathered from several sources, then transformed and stored in a single database, so that the organization can perform complete analysis over it and generate reports and dashboards for a clearer analytical view. This model is referred to as the data warehouse.
A data warehouse is where data from different sources is integrated, transformed and stored. Data in the data warehouse is non-production data, used mainly for analysis and reporting. Business users and other corporate stakeholders use the data warehouse to analyze data, which helps them shape future strategies and business decisions.
The data warehouse is intended for query and analysis rather than transaction processing. It generally contains historical data derived from transaction data, and enables an organization to consolidate data from several sources.
A data warehouse collects data from different sources in different formats, which is why ETL testing is important. ETL testing differs from database testing and is considerably more complex. It primarily involves testing the extraction of data from different sources, the business transformation logic, and the loading of data into the target tables.
HOW IS THE ETL TEST PROCESS DIFFERENT FROM A STANDARD TEST PROCESS?
- The test objective is to enable customers to make intelligent decisions based on accurate and timely analysis of data.
- The test focus should be on verification and validation of the business transformations applied to the data that helps customers make decisions.
- Data must be verified to have moved correctly, and the number of records transferred must be counted. The count should be the same in both the source and the target.
- It should be verified that data has been transformed correctly as per the applied business logic.
- History of data must be maintained, because a data warehouse typically keeps historical data, unlike transactional systems, which focus mainly on recent data.
- For testing, an analytical cube needs to be built and populated.
- Testing of the generated reports and dashboards comes at the end, since these are what customers actually need.
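The record-count check in the list above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes the source and target tables are reachable through one SQLite connection, and the table names `src_orders` and `dw_orders`, the key `order_id` and the helper `build_demo` are hypothetical names made up for the example.

```python
import sqlite3


def reconcile_counts(conn, source_table, target_table):
    """Return (source_count, target_count) for two tables on one connection."""
    # Table names are interpolated directly; in this sketch they come from
    # trusted test configuration, never from user input.
    src = conn.execute(f"SELECT COUNT(*) FROM {source_table}").fetchone()[0]
    tgt = conn.execute(f"SELECT COUNT(*) FROM {target_table}").fetchone()[0]
    return src, tgt


def missing_keys(conn, source_table, target_table, key):
    """Return keys present in the source but absent from the target."""
    sql = (
        f"SELECT s.{key} FROM {source_table} s "
        f"LEFT JOIN {target_table} t ON s.{key} = t.{key} "
        f"WHERE t.{key} IS NULL ORDER BY s.{key}"
    )
    return [row[0] for row in conn.execute(sql)]


def build_demo():
    """Create a hypothetical in-memory source and target with one lost row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_orders (order_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE dw_orders (order_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO src_orders VALUES (?, ?)",
        [(1, 10.0), (2, 20.0), (3, 30.0)],
    )
    conn.executemany(
        "INSERT INTO dw_orders VALUES (?, ?)",
        [(1, 10.0), (2, 20.0)],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo()
    print(reconcile_counts(demo, "src_orders", "dw_orders"))
    print(missing_keys(demo, "src_orders", "dw_orders", "order_id"))
```

A count mismatch only says that something was lost or duplicated. The key comparison narrows down which records are affected, which is usually the more useful finding in a defect report.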
REASONS FOR DW/BI PROGRAM FAILURE INCLUDE
- Data volume and complexity.
- Data inconsistencies across disparate source systems.
- Data loss during the integration process.
- Data architecture and model that are not scalable or flexible.
- Incorrect business flow.
- Slow query response.
To overcome the problems that can lead to project failure, DW/BI testing is divided into the phases listed below:
- Extraction, Transformation & Load Testing
- OLAP Testing
- Business Intelligence Testing
CHALLENGES IN ETL TESTING PHASE
Is My Data Accurate?
Once the data acquisition stage is finished, the next step is data cleansing and filtering. Data cleansing is the process of removing unwanted data. After cleansing, the data set becomes usable and ready to be fed to the next work areas. The main testing task here is to check the data and ensure that there are no typographical errors, to validate field values against a known list of entities, and to verify that the first level of transformation rules is correctly implemented as required.
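Validating field values against a known list can be sketched as below. The reference set `VALID_COUNTRIES`, the field name `country` and the function name are assumptions made for illustration; a real project would take the list from its own reference data.

```python
# Hypothetical reference list; in practice this would come from master data.
VALID_COUNTRIES = {"US", "GB", "DE"}


def find_invalid_values(rows, field, allowed):
    """Return (row_index, value) pairs whose field is not in the allowed set.

    Rows are dicts. The comparison is exact, on the assumption that cleansing
    has already trimmed whitespace and normalised case; anything that still
    differs is reported as a defect.
    """
    problems = []
    for index, row in enumerate(rows):
        value = row.get(field)
        if value not in allowed:
            problems.append((index, value))
    return problems


if __name__ == "__main__":
    cleansed = [
        {"country": "US"},
        {"country": "us "},   # typographical variant that survived cleansing
        {"country": None},    # missing value
        {"country": "DE"},
    ]
    for index, value in find_invalid_values(cleansed, "country", VALID_COUNTRIES):
        print(f"row {index}: unexpected value {value!r}")
```

Reporting the row index together with the offending value keeps the check useful on large extracts, where a bare pass or fail would not tell the tester where to look.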
Is the Expected Transformation Taking Place?
Data transformation is the process of mapping source data to destination data using the business transformation logic.
Mappings include 1-1 look-ups, switch cases, DB logic, combinations, truncation, defaulting and null processing. End-to-end testing of the data flow is very important.
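A common way to test such mappings is to re-implement the documented rules independently and compare the result with what the ETL job actually loaded. The sketch below assumes three made-up rules (a status look-up with a default, truncation of the name to ten characters, and replacing a null credit limit with 0). The column names and constants are hypothetical.

```python
# Hypothetical mapping rules, standing in for a real mapping document.
STATUS_LOOKUP = {"A": "Active", "I": "Inactive"}  # 1-1 look-up
NAME_MAX_LEN = 10                                 # target column width


def transform(source_row):
    """Apply the assumed mapping rules to one source row (a dict)."""
    name = (source_row.get("name") or "").strip()
    credit_limit = source_row.get("credit_limit")
    return {
        # Truncation to the target column width.
        "customer_name": name[:NAME_MAX_LEN],
        # Look-up, defaulting to "Unknown" for unmapped codes.
        "status": STATUS_LOOKUP.get(source_row.get("status"), "Unknown"),
        # Null processing: a missing limit is stored as 0.
        "credit_limit": 0 if credit_limit is None else credit_limit,
    }


def diff_rows(expected, actual):
    """Return {column: (expected, actual)} for every column that differs."""
    return {
        col: (expected.get(col), actual.get(col))
        for col in sorted(set(expected) | set(actual))
        if expected.get(col) != actual.get(col)
    }


if __name__ == "__main__":
    source = {"name": " Christopher Lee ", "status": "A", "credit_limit": None}
    loaded = {"customer_name": "Christophe", "status": "Active", "credit_limit": None}
    print(diff_rows(transform(source), loaded))
```

Keeping the expected-value logic separate from the ETL code matters here: if the test reused the ETL's own transformation, a wrong rule would pass unnoticed.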
Is the Data Successfully Loaded into the Desired Targets?
The data load phase loads the data into the target, which is usually the data warehouse. Depending on the business requirement, the load can be full or incremental. Loading can take place daily, weekly or monthly, so testing in this phase needs to ensure that the correct data is loaded within the expected time window.
The data loaded into the fact and dimension tables represents the actual picture of the data warehouse.
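For an incremental load, one useful check is that the target afterwards contains exactly the previous rows plus the new delta, with nothing lost, nothing extra and nothing loaded twice. The sketch below works on lists of business keys and assumes the tester can capture the target keys before and after the load. The function name and the shape of the result are illustrative choices.

```python
from collections import Counter


def check_incremental_load(before_keys, delta_keys, after_keys):
    """Compare the target's keys after an incremental load with expectations.

    before_keys: keys in the target before the load
    delta_keys:  keys the load was supposed to add
    after_keys:  keys found in the target after the load
    """
    expected = set(before_keys) | set(delta_keys)
    counts = Counter(after_keys)
    return {
        "missing": sorted(expected - set(after_keys)),
        "unexpected": sorted(set(after_keys) - expected),
        "duplicated": sorted(key for key, n in counts.items() if n > 1),
    }


if __name__ == "__main__":
    result = check_incremental_load(
        before_keys=[1, 2],
        delta_keys=[3, 4],
        after_keys=[1, 2, 3, 3, 5],
    )
    print(result)
```

A full load can be checked with the same function by passing an empty `before_keys` list, since the expected content is then the delta alone.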
CHALLENGES IN ONLINE ANALYTICAL PROCESSING (OLAP) TESTING PHASE
When the ETL stage is finished, data moves into OLAP. This stage has its own particular testing challenges. OLAP involves a relational database. It is a method of storing data in multidimensional form, generally for reporting and analytical dashboards. Testing needs to ensure that data from the data warehouse is correctly mapped into the OLAP cube according to the business needs. Another important test is schema testing, which is generally done to ensure that the table and data structures conform to the business specifications.
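One way to check the warehouse-to-cube mapping is to aggregate the fact rows independently and compare the totals with the values read from the cube. The sketch below assumes the cube cells have already been exported into a dictionary keyed by dimension values. How that export is obtained depends on the OLAP tool and is not shown, and the column names `region`, `year` and `sales` are hypothetical.

```python
from collections import defaultdict


def aggregate_facts(fact_rows, dimensions, measure):
    """Sum the measure over fact rows, grouped by the given dimension columns."""
    totals = defaultdict(float)
    for row in fact_rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


def compare_cube(fact_rows, cube_cells, dimensions, measure, tolerance=1e-9):
    """Return cells where the cube differs from the warehouse aggregate.

    The result maps each differing cell to (warehouse_value, cube_value);
    None marks a cell that exists on only one side.
    """
    expected = aggregate_facts(fact_rows, dimensions, measure)
    mismatches = {}
    for key in sorted(set(expected) | set(cube_cells)):
        exp, act = expected.get(key), cube_cells.get(key)
        if exp is None or act is None or abs(exp - act) > tolerance:
            mismatches[key] = (exp, act)
    return mismatches


if __name__ == "__main__":
    facts = [
        {"region": "EU", "year": 2014, "sales": 100.0},
        {"region": "EU", "year": 2014, "sales": 50.0},
        {"region": "US", "year": 2014, "sales": 70.0},
    ]
    cube = {("EU", 2014): 150.0, ("US", 2014): 75.0}
    print(compare_cube(facts, cube, ["region", "year"], "sales"))
```

The tolerance parameter is there because cube engines and the warehouse may round floating-point measures differently; for integer measures it can stay at its default.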
CHALLENGES IN REPORTING & DASHBOARD TESTING PHASE
Reports are tested by verifying the layout against the shared design mockups. The drilling, sorting and export functions of the reports are also verified in the web environment.
Verifying the data populated against the metrics and attributes visible on the report is very important.
Dashboard testing is similar: it involves verifying all the widgets, the layout, the data population and the generated graphs.
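The sorting and export checks can be partly automated once the rows shown on a report have been captured. The sketch below assumes the displayed rows are available as a list of dicts and the export is a CSV string. Capturing them from the BI tool is tool-specific and left out, and the function names are illustrative.

```python
import csv
import io


def is_sorted(rows, column, descending=False):
    """Check that rows appear in sorted order of the given column."""
    values = [row[column] for row in rows]
    return values == sorted(values, reverse=descending)


def parse_export(csv_text):
    """Parse an exported CSV (with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def export_matches(displayed_rows, csv_text):
    """Check that the export holds the same rows, in the same order, as the report.

    CSV carries no types, so displayed values are compared as strings.
    """
    normalised = [{k: str(v) for k, v in row.items()} for row in displayed_rows]
    return parse_export(csv_text) == normalised


if __name__ == "__main__":
    displayed = [
        {"region": "EU", "sales": 150},
        {"region": "US", "sales": 70},
    ]
    exported = "region,sales\nEU,150\nUS,70\n"
    print(is_sorted(displayed, "sales", descending=True))
    print(export_matches(displayed, exported))
```

Comparing values as strings is a deliberate simplification. Where the report applies number or date formatting, the comparison would need to apply the same formatting rules first.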
MANY ORGANIZATIONS RELY ON THE DATA WAREHOUSE FOR KEY DECISIONS. HENCE IT IS CRITICAL TO BUILD AN EFFECTIVE DATA WAREHOUSE AND ENSURE ITS INTEGRITY AND QUALITY THROUGHOUT.
YET ORGANIZATIONS STILL FACE CHALLENGES IN END-TO-END TESTING OF DATA WAREHOUSE SYSTEMS TO ENSURE THEIR QUALITY AND INTEGRITY.