Data boundaries and data quality in a BI Platform

Many problems can arise when a BI Platform becomes the system of record for various data elements. The platform can suffer from missing data, stale data, or incorrect results.

Definitions:

  • BI Platform: the combination of components required to deliver reports: Data Warehouse, Data Marts, and the Reporting Application.
  • System of Record: the authoritative data source for a given data element.
  • Missing data, stale data, and incorrect results: Poor data quality that renders the data unfit for the intended use.

Ideally, the BI Platform should consume nearly all of its data from other systems and should rarely become the system of record for maintaining data. However, launching the BI Platform with meaningful reports in a reasonable amount of time frequently means it becomes the system of record for one or more types of data.

Common types of data/situations where a BI Platform becomes the system of record:

Data mapping between systems
The data used to connect two or more systems is often inconsistent. For example, “United Kingdom” in one system and “UK” in another system must be mapped to the same member of the conformed dimension.
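
As a minimal sketch of such a mapping, the Python below keeps a small lookup keyed by source system and raw value. The system names ("crm", "erp"), the country keys, and both function names are hypothetical and chosen only for illustration; a real platform would more likely hold this in an ETL mapping table.

```python
from typing import Iterable, List, Optional

# Hypothetical mapping: (source system, raw value) -> conformed country key.
COUNTRY_MAP = {
    ("crm", "United Kingdom"): "GB",
    ("erp", "UK"): "GB",
    ("crm", "United States"): "US",
    ("erp", "USA"): "US",
}


def conform_country(source_system: str, raw_value: str) -> Optional[str]:
    """Return the conformed key, or None when the value has no mapping."""
    return COUNTRY_MAP.get((source_system, raw_value.strip()))


def find_unmapped(source_system: str, raw_values: Iterable[str]) -> List[str]:
    """List distinct unmapped values so a data owner can review them."""
    return sorted(
        {v for v in raw_values if conform_country(source_system, v) is None}
    )
```

Reporting unmapped values, rather than silently dropping them, is one way to keep missing data visible during the load.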

Master Taxonomy Data
Dimension hierarchies (e.g., Geography) that are not mastered by any single system can be stored in the BI Platform in a clean format. Custom groupings of attributes can also be added to give report users more options.
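
To make this concrete, here is a small Python sketch of a geography hierarchy with one custom grouping. The countries, regions, the "Priority Markets" group, and the function names are all assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical geography hierarchy maintained in the BI platform.
GEOGRAPHY = {
    "GB": {"country": "United Kingdom", "region": "Europe"},
    "FR": {"country": "France", "region": "Europe"},
    "US": {"country": "United States", "region": "North America"},
}

# Hypothetical custom grouping that no source system provides.
CUSTOM_GROUPS = {"Priority Markets": {"GB", "US"}}


def rollup(values_by_country: Dict[str, float], level: str) -> Dict[str, float]:
    """Sum values by a hierarchy level such as 'country' or 'region'."""
    totals: Dict[str, float] = defaultdict(float)
    for key, value in values_by_country.items():
        totals[GEOGRAPHY[key][level]] += value
    return dict(totals)


def rollup_group(values_by_country: Dict[str, float], group: str) -> float:
    """Sum values for the countries in a custom grouping."""
    members = CUSTOM_GROUPS[group]
    return sum(v for k, v in values_by_country.items() if k in members)
```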

Business Rules
This covers metric definitions and the logic for calculating KPIs, along with related indicator thresholds, targets, custom personnel information, and other business data that is small in scope and does not change often.
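
One way to picture a business rule of this kind is a KPI target with indicator thresholds. The sketch below is illustrative only: the KpiRule fields, the threshold semantics (fractions of target), and the red/yellow/green labels are assumptions, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiRule:
    """Hypothetical business rule: a target plus indicator thresholds."""
    name: str
    target: float
    warn_below: float  # fraction of target below which status is "yellow"
    fail_below: float  # fraction of target below which status is "red"


def kpi_status(rule: KpiRule, actual: float) -> str:
    """Compare an actual value with the rule's target and thresholds."""
    ratio = actual / rule.target
    if ratio < rule.fail_below:
        return "red"
    if ratio < rule.warn_below:
        return "yellow"
    return "green"
```

Keeping the rule as data, separate from the function that evaluates it, means a data owner can change a target without a code change.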

Application Data
If a user interacts with data and the result is saved back to the data warehouse, that data is application data. To support this, the BI reporting system must transition from a reporting tool to an application.

Given how often these situations occur, it is important that a BI Architect has well-defined methods to ensure effective data management.

Best Practice #1: Keep logically organized and divided responsibilities within the BI platform

  • Data mapping between systems → ETL Component
  • Master Taxonomy Data → Data Warehouse Component
  • Business Rules → Data Mart Component
  • Application Data → BI Application Component

Best Practice #2: Review, rationalize, and limit the amount of data managed
Maintaining data in the reporting system can become a bad habit, and the amount of data can easily grow until it is out of control.

  • Search for other systems that can be the system of record from the start
  • Require a data owner and maintenance plan
  • Expose the data in the reporting environment so users can see it
  • Do not create unnecessary data sets in the reporting database
  • Keep it small and simple

Best Practice #3: Follow standards for robust data quality
It’s important to realize that BI developers can become lax when the BI system becomes the system of record for certain items. It’s easy to overlook the following IT-style rules for keeping track of data in a BI system:

  • Keep track of created date, updated date
  • Keep track of created by user, updated by user
  • Keep the data organized with naming conventions
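
The rules above can be sketched with Python's built-in sqlite3 module. The table name, the "ref_" prefix used as a naming convention, the column names, and the save_target helper are hypothetical; the point is only that every write records who made it and when.

```python
import sqlite3
from datetime import datetime, timezone


def create_table(conn: sqlite3.Connection) -> None:
    """Create a reference table with audit columns ('ref_' prefix is assumed)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ref_kpi_target (
            kpi_name     TEXT PRIMARY KEY,
            target_value REAL NOT NULL,
            created_at   TEXT NOT NULL,
            created_by   TEXT NOT NULL,
            updated_at   TEXT NOT NULL,
            updated_by   TEXT NOT NULL
        )
        """
    )


def save_target(conn: sqlite3.Connection, kpi_name: str,
                target_value: float, user: str) -> None:
    """Update an existing row, or insert a new one, stamping audit columns."""
    ts = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "UPDATE ref_kpi_target "
        "SET target_value = ?, updated_at = ?, updated_by = ? "
        "WHERE kpi_name = ?",
        (target_value, ts, user, kpi_name),
    )
    if cur.rowcount == 0:
        # No existing row: the creator is also the first updater.
        conn.execute(
            "INSERT INTO ref_kpi_target VALUES (?, ?, ?, ?, ?, ?)",
            (kpi_name, target_value, ts, user, ts, user),
        )
    conn.commit()
```

A short usage example: after `save_target(conn, "revenue", 100.0, "alice")` followed by `save_target(conn, "revenue", 120.0, "bob")`, the row keeps "alice" as the creator and records "bob" as the last updater.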

Best Practice #4: Transition the data management responsibilities to new systems
The best long-term solution for a BI reporting platform is to transition the data management responsibilities to new systems. Data management can be optimized by using a targeting system, a master data management system, or by integrating the data into an existing application.

Evan Schmidt

Published on 06/11/2013
