The Advantage of Good Business Intelligence

February 15, 2009 by: Peter B. Giblett

This is an excerpt from a book in-progress about successful BI implementation and is published here to give the reader clarity about my background…

The aim of most corporations is to run their organisations with minimal cost and use informed decisions in order to give them a competitive advantage. In all too many organisations data and information is not used to its best advantage. They have been leveraging operational data in order make decisions but the process all too often is hit and miss and usually relies upon intensive manual labour to collate, sort, fix and summarise the data used. The process often takes most of the month to complete, with last month’s month end reports being available during the last week of this month.

Ready access to information about an organisation’s business, products and customers is a key element in supporting decision making. The creation of the data warehouse is an essential step in turning operational data into information. It is the Data Warehouse that is at the heart of the Business Intelligence provision.

At the very minimum the Data Warehouse provides a mechanism to collate, sort, validate fix and summarise data automatically, leaving those involved in manual processes to be reassigned to other duties. Computers have always been good at taking laborious monotonous tasks away from humans and doing them repetitively (shhhh! They can also do it much faster than the human can and are much more efficient).

However this is the minimum requirement, the Data Warehouse can provide so much more.

The Data Warehouse Defined

The Data Warehouse needs to be clearly distinguished from operational or on-line transaction processing (OLTP) systems. The readers should be familiar with OLTP systems. The Data Warehouse makes use of data in a different way to OLTP systems.

The Data Warehouse has been defined by Bill Inmon as:

A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision making process.

It was also defined by Ralf Kimball as:

a copy of transaction data specifically structured for query and analysis.

And by Price Waterhouse as:

An orderly and accessible repository of known facts…that is used as the basis for making better management decisions.

It is Subject Oriented because it is classified and organised around subjects meaningful to an organisation. It has collected all data available about a subject from all the operational systems. E.g. all sales, not just those captured via the e-business channel. The data from various systems is integrated into the Data Warehouse. It holds time-variant information and all facts/measurements will be time stamped either through their source transaction, or upon entry to the warehouse. Data can be both historical (e.g. 6 to 12 years old) or nearly current (e.g. yesterday, last week, last month). Data is non-volatile because it does not change in response to current operational activities and is based on a controlled data collection process. The data has not changed as soon as the report is created.

The Data Warehouse takes a copy of transactional data and structures it specifically to allow efficient query and analysis to take place throughout the warehouse in a uniform manner. This process will underpin corporate Business Intelligence (BI).

There are many ways, of course, to copy a transactional database, such as:

  • To allow the management of a distributed database with each branch being responsible for its own data management with head-office systems being updated overnight.
  • Replication to ensure that mission-critical database systems continue to operate in a 24×7 environment with the highest degree of availability possible. E.g. for use as a hot-standby.
  • Replication for taking backups of critical systems, where the backup is completed off-line and subsequently re-synchronised.
  • For inter-corporate communications. E.g. for automated inter-company purchase systems, or where data is sold.

It is clear that simple copies of databases do not amount to the creation of a Data Warehouse. This requires a restructuring process that reorganises data more effectively for analytical purposes and may include aggregation and summarisation of data. The Data Warehouse must not impact the operation or performance of the originating systems. The creation of a separate query and analysis database, which does not impinge on the day-to-day operational activities of the corporation, is one of the key reasons why it is necessary to build a Data Warehouse.

The Warehouse is orderly because it takes data from a number of diverse applications. It is methodically arranged with data being pulled into a structured data repository. This repository is accessible for use via reporting applications which will supply users with facts (or measurements) about their business, which will be used to support the management decision process.

Additionally Data Warehouses can provide the springboard for other management activities, as defined by Richard Tanler:

The data warehouse provides a single source of management information about the past, it should also provide information about the future… Bill Inmon’s original definition… does not specify that the data warehouse is intended to be read only. Bill chose his words carefully in using non-volatile… Maintaining a forecast in the data warehouse requires periodic update of the tables or, more appropriately, creating new tables reflecting a new forecast version.

It is clear that a Data Warehouse can ultimately do much more than simply report on past activities. It can provide the basis for future activities, such as budgeting and forecasting. World banks have, in the wake of September 11th 2001, had to take measures to block money laundering transactions from known terrorists, or their financiers. Statutory bodies have also waded in with demands for greater compliance. The Sarbanes Oxley laws and the Basel II accords are examples of compliance requirements where applications will depend on a Data Warehouse as
their data source.

hub-spoke

Diagram 1 Hub & Spoke Architecture

Ultimately the Data Warehouse is a tool used by business to supply information necessary for informed decision making. Information is distinct from data. Information is data that has been intelligently processed via a Data Warehouse to allow it to be used for key business decision making. These characteristics are common to all data warehouses, but not all Data Warehouses are the same, they simply share common characteristics, both technically and in the business. These characteristics include the desire to use analysis of performance to drive the decision making process and identify areas to improve efficiency and gain a competitive advantage.

It is useful to understand the similarities and differences between OLTP and Data Warehouse Systems this can be understood from Figure 1 that shows the purpose of each database type, how it is organised, the type of access that applies, and the data held in the database.

Figure 2 - Comparing OLTP and the Data Warehouse

Figure 2 - Comparing OLTP and the Data Warehouse

Data Warehouses

Data Warehouses are capable of holding granular and historic data. In the Data Warehouse context granular data is the lowest level of detail retailed within the warehouse. Granularity is discussed in detail in later chapters. Historic data is held in a data warehouse for two primary reasons:

  1. To build a deeper understanding of the functioning of the corporation over a long time span. This may include the ability to forecast future events of the basis of past history, or
  2. To meet statutory demands.

Data Warehouses can become eternally large. Until the advent of the Warehouse the terms terabyte and petabyte had no practical usage. They were term only known to academics and research scientists. Data Warehouse adoption is leading corporations to adopt large servers with the capability to store very large volumes of data for long periods of time. Normally historic data is summarised according to business rules. E.g. monthly Sales
Volumes are retained, not every order and order line. The amount of historic data that must be kept will depend on specific business requirements.

It is important to understand that the Data Warehouse is NOT an operational archive. It is specifically formatted for business intelligence, and may not retain all values present in operational systems. It retains a history, but the purpose of this is to facilitate comparative analysis over time. It should be the primary source for all business intelligence activities within the organisation. The Data within the Warehouse is an important corporate resource that must be effectively managed.

When the Data Warehouse should be used for Reporting

Whilst there are many demands for reports to be produced within day-to-day business activities the majority of these requirements should be met through the operational reports available within the applications such a Oracle Applications, SAP etc. The Data Warehouse is aimed at supporting management decisions within the company and should only be used when one or more of the following conditions are met:

● When regulatory reporting is demanded.
● For cross business reporting
● Where the subject matter is to be included within KPI measures
● Where historical snapshots are required
● Time series analysis is required

The warehouse will become the source for all business intelligence activities within the organisation. As shown in Figure 2.1 the data warehouse gathers a copy of transaction data used for generating management and statutory reporting and information retrieval and analysis. The equations shown in Diagram 2 are the common business drivers behind the information gathering exercise that is at the heart of creating the Data Warehouse:

better-decisions

and

competative-advantage

Diagram 3 – Business drivers for the Data Warehouse

The aim of the warehouse is to empower the business community with mission critical information at the time when decisions need to be taken. Each of the elements described in this diagram are discussed throughout the remainder of this chapter

Information
Information is a competitive resource that is in increasing demand in the business environment today.

Information is made up from data, but data is not information, Information, when looked at in relation to a Data Warehouse is much more than a collection of data. Information has three sub-components, the past, the present, and the future. The task of turning data into information relies on the following primary steps

● Extract relevant data from all relevant systems

● Transport data to the warehouse platform

● Stock source data tables

● Validate source data

● Identify timeliness of source data

● Assign consistent, repeatable keys

● Update dimensional attributes

● Update base measure attributes

● (Calculate derived measure attributes)

● Aggregate data using dimensional structure and business rules

Information is therefore processed data, associated with the structure of the organisation, its products etc.One of the keys to generating information is tying data to a timeline. All data items will relate to a time period. Legacy systems cannot be relied upon to record when events happened; it is therefore the duty of the Data Warehouse to identify this.Consider the questions posed in Diagrams above, these are typical problems faced.

One of the keys to generating information is tying data to a timeline. All data items will relate to a time period. Legacy systems cannot be relied upon to record when events happened; it is therefore the duty of the Data Warehouse to identify this. Consider the questions posed in Diagram 3, these are typical problems faced.

Within the context of the Data Warehouse project it is necessary to recognise that information is a resource that must be managed effectively. Data cannot be an afterthought; it is the foundation for the project and must be effectively managed.

Historic Information

Historical information could be seen as the total of all history (potentially including yesterday). Normally historical information will be seen as data that is not part of any current time period. Historic data is normally heavily summarised, and is rarely kept at a granular level.

The amount of historic data to be kept will depend on the business being analysed. Some industries, like Banks, are governed by strict statutory regulations enforced either by the national governments or by international organisations. Before starting the Data Warehouse project it is essential to have an idea about how much history is
required.

Today’s Information

As soon as the project starts processing data from source systems it will become obvious that the warehouse rarely stores any information about today. If today’s information were actually stored within the Data Warehouse then it would be subject to change during the continuing business day.

In Data Warehouse terms today’s information is the current period, the current day, week, month, quarter, year etc. The term “Current Period” has special significance within many OLAP reporting tools and can be used in order to determine reporting periods.

The current period is the most recent period within the Data Warehouse. In designing the database it will be necessary to decide on levels of granularity and the longevity of the data stored.

Future Information

So we have built our Data Warehouse and filled it with current and historic data, so where do we get future information from? The Data Warehouse is the natural home of forecasting and budgeting applications.

Analytics & Business Intelligence

The population of the Data Warehouse is simply the enabler to provide analytical information, the Starting point for Business Intelligence (BI). Analysis is not simply the production of weekly or monthly reports. It is at a very minimum taking reporting outputs, identifying trends and measurements of key business importance.

Analysis is defined in the Oxford Dictionary of English as:

Detailed examination of the elements or structure of something, typically as the basis for discussion or interpretation.

and:

The systematic and critical assessment by an organisation of a feature of a product to ensure that its cost is no greater than necessary to carry out its functions.

Within the context of the business intelligence application analysis is the discovery of key performance indicators though the use of the information stored within the Data Warehouse.

Analytical use of a Data Warehouse relies upon the use of Business Intelligence or On-Line Analytical Processing (OLAP) tools. These are reporting tools with the ability navigate, through the dimensional structure and show the impact of important measures contained within it. Analysis of the information contained within the warehouse will contribute to the decision making process of the organisation. This is supported by the ability to drill down for more detail or drill up to higher summary levels.

Data Mining also aids analysis of the data. Data mining tools are a specialist type of analytical tool that provides the capability of applying mathematical or statistical processes for the purpose of extracting hidden knowledge within large data sets. Data mining techniques provide information inherent through patterns or relations within the data. This data can be used to make predictions about future behaviour based upon learning algorithms. Data Mining will be applicable once the organisation has built up sufficient validated history within its Enterprise-wide Data Warehouse.

BI Supports Better Decisions

Improving the decision making process is always the goal of any corporation. The term “Better Decisions” means using the information contained within our Data Warehouse and applying relevant analysis in order to make informed decisions about the business. The decision is of course still a human action, what differs is that the person making the decision is making decisions based on facts, not on guess work.

better-decisions

The types of decisions made will differ according to the perspective of the decision maker. Diagram 4 looks at the differences between operational, tactical and strategic decision making. Each has a place within the corporate decision structure and must be supported by the Data Warehouse.

Operational, Tactical and Strategic Decision Making

Diagram 4 - Operational, Tactical and Strategic Decision Making

Operational decisions are based on effective and efficient use of facilities and resources to carry out activities within budgetary constraints. Operational decisions are based on questions that include: What do we sell? Where is it sold? Who did we sell it to? The answers to these questions will generally be based on detailed (highly granular) data. Operational decisions are by their nature responsive to a changing environment. There will be a need for periodic reporting, such as the ‘Monthly Sales Report by Salesman’ but this is accompanied by ad-hoc reporting with new reports generated when they are required. Today’s demands will be different than next month’s queries. Operational reporting tends to have limited summarisation.

Tactical decisions are based on acquisition of resources, tactics the establishment and monitoring of budgets. They deal primarily with past events. Analytical tools are required to understand and act on the day-to-day changes in market demand, variations in environmental factors that may be unique to a specific industry or area
of competition. Much of this type of analysis will be trend (or time series) based with a need to monitor how factors change over time, particularly with reference to industrial norms or in relation of specific competitors.

Strategic decisions are based on corporate goals, polices and determination of organizational objectives. Strategic decision making is about answering the following questions: What is the overall health of our business (now and in the future)? Are all the disparate parts of our organisation working towards the same goals? How many
new customers will we have next quarter/year? How many will we lose to the competition? What new products/services will we introduce? What impact will technology, government regulations and customer patterns and preferences have on our industry?

A Data Warehouse will provide the information that enables these questions to be answered via quantitative methods and analysis. Often this is provided by a combination of internal data and external research. The business will take the information contained within the Data Warehouse, analyse it, and use it to implement new goals, which in-turn require monitoring and further modification. This is a performance cycle that requires continuous monitoring for sustained success.

The next part to this article can be found at “Leveraging Information to Create a Competative Advantage“.

Leave a Reply

You must be logged in to post a comment.

blog comments powered by Disqus