Oracle Essentials [Electronic resources] : Oracle Database 10g, 3rd Edition نسخه متنی

9.1 Business Intelligence Basics

Why build a data warehouse or business
intelligence solution? Why is the data in an On-Line Transaction
Processing (OLTP) database only part of a business intelligence
solution? Data warehouses are often designed with the following in
mind:

Strategic analyses discern trends in data, rather than individual facts.

Data warehouses often are used to create simple reports based on
aggregate values culled from enormous amounts of data. If OLTP
databases attempted to create these aggregates on the fly they would
use a lot of resources, which would impact their ability to process
transactions in a timely manner.

The information in a data warehouse is more commonly read-only, with infrequent updates.

This means that the overhead of transaction control, an important
part of a normal OLTP database system, isn't really
needed for a data warehouse.

The data used for analyses may not require up-to-the-minute accuracy.

Because strategic analysis is concerned with trends over time, the
data used can be a day, a week, or even a month old, depending on the
analysis being done. This means that data in a data warehouse can
have some or all of its aggregate values created as part of a batch
process offline. However, as warehouses are used for tactical
management, near real-time accuracy emerges as a requirement.

The design required for an efficient data warehouse differs from the standard normalized design for a relational database.

Queries are typically read against a fact table, which may contain
summary information using a specific type of schema design called a
star
schema. This design lets you access facts quite flexibly
along key dimensions or "look-up"
values. (The star schema is described in more detail later, in Section 9.2.)
For instance, a data warehouse user may want to compare the total
amount of sales, which comes from a fact table, by region, store in
the region, and items, all of which can be considered key dimensions.

9.1.1 The Evolution of Business Intelligence

Business
intelligence is not a new idea. The use of corporate data for
strategic decision making beyond simple tracking and day-to-day
operations has gone on for almost as long as computing itself.

Quite early, builders and users of operational systems recognized
potential business benefits of analyzing the data in complementary
systems. In fact, much of the early growth in personal computers was
tied to the use of spreadsheets that performed analyses against data
downloaded from the operational systems. Business executives began to
direct IT efforts toward building solutions used to obtain a better
understanding of business using existing business data. This
understanding enabled new business strategies. Many of these early
joint initiatives were successful data warehousing projects. Today,
data warehouses are used in business areas such as customer
relationship management, sales and marketing campaign analysis,
product management and packaging, financial analysis, supply chain
analysis, and risk and fraud analysis.

In the 1980s, many organizations began using dedicated servers for
these applications, which were collectively known as
decision support systems
(DSS). Decision-support queries tended to be particularly
CPU and memory intensive and took place in a primarily read-only
environment, whereas traditional OLTP was typically I/O intensive
with a large number of updates. The characteristics of queries were
much less predictable (e.g., more
"ad-hoc") than what had been
experienced in OLTP systems. This led to the development of data
stores for decision support apart from those for OLTP.

When Bill Inmon (whose books are noted in Appendix B) and others
popularized the term "data
warehouse" in the early 1990s, a formalized common
infrastructure for building a solution came into being. Since then,
the topology of the data warehouse has continued to evolve, as the
next section illustrates.

9.1.2 A Topology for Business Intelligence

Although
data warehouses were initially conceived as a large, enterprise-wide
source of all information, their topology often now appears as a
multi-tier architecture, as shown in Figure 9-1.

Figure 9-1. Typical business intelligence topology

The topology shown in Figure 9-1 is the mature result of years of
experience. The evolution from a traditional client/server
environment to multiple tiers occurred for a variety of reasons.
Initial efforts at creating a single warehouse often resulted in
"analysis paralysis." Just as
efforts to define an enterprise-wide OLTP model often take years (due
to cross-departmental politics and the scope of the effort), similar
attempts in data warehousing also took much longer than businesses
were willing to accept. These efforts were further hampered by the
continually changing analysis requirements necessitated by a
continually changing marketplace. While the data elements and
requirements for operational systems can remain relatively stagnant
over time, understanding business trends can be like trying to catch
lightning in a bottle.

Consequently, attempts at building such enterprise-wide models in an
effort to satisfy everyone often satisfied no one.

9.1.3 Data Marts

When some large-scale, enterprise-only data warehouse efforts ended
in dismal failure, frustration and impatience followed. Some people
reacted and built department-focused independent
data marts by
extracting data from the appropriate operational source systems. Many
data marts were initially quite successful because they fulfilled a
specific business need relatively quickly.

However, problems began to surface. There was often no coordination
between departments regarding basic definitions, such as
"customer." If a senior manager
asked the same question of multiple departments, the answers provided
by these independent data marts were often different, thus calling
into question the validity of any of the marts. Many departments also
encountered ongoing difficulty in managing these multiple data marts
and in maintaining extractions from operational sources (which were
often duplicated across multiple departments).

As decision-support architects took another look at their solutions,
they began to realize that it was very important to have a consistent
view of the detailed data at an enterprise data warehouse level. They
also saw that data marts could solve business problems and provide
return on investment in an incremental fashion. Today, most
successful implementers simultaneously grow dependent data marts one
business solution at a time while growing the enterprise warehouse
server in an incremental fashion.

The currently accepted definition of a data mart is simply a subject-
or application-specific data warehouse, usually implemented within a
department. Typically, these data marts are built for performance and
may include a large number of summary tables. Data marts were
initially thought of as being small, because not all the detail data
for a department or data from other departments need be loaded in the
mart. However, some marts get quite large as they incorporate data
from outside sources (sometimes purchased) that
isn't relevant in other parts of the business.

In some companies, data marts are deployed to meet specific project
goals with models optimized for performance for that particular
project. Such data marts are retired when the project is completed
and the hardware is reused for other projects. As the analysis
requirements for a business change, the topology of any particular
data warehouse is subject to evolution over time, so developers must
be aware of this possibility.

Increasingly, companies interested in costs savings are reexamining
the wisdom of having a large number of data marts and are
consolidating marts where possible. Recent new database features
enabling effective management of different user communities helped to
make consolidation possible.

9.1.4 Operational Data Store and Enterprise Warehouse

The operational data
store (ODS) concept also grew in popularity in the 1990s.
The ODS may best be described as a distribution center for current
data. Like the OLTP servers, the schema is highly normalized and the
data is recent. The ODS serves as a consolidation point for reporting
and can give the business one location for viewing current data that
crosses divisions or departments. The popularity of the ODS grew in
part as a result of companies in the midst of acquisitions and
mergers. These organizations often face mixed-application
environments. The ODS acts as a staging location that can be used as
the source for further transformations into a data warehouse or into
data marts.

The warehouse server, or
enterprise data
warehouse, is a multi-subject historical information store
usually supporting multiple departments and often serving as the
corporate database of record. When an ODS is established, the
warehouse server often extracts data from the ODS. When an ODS
isn't present, data for the warehouse is directly
extracted and transformed from operational sources. External data may
also feed the warehouse server.

As noted previously in the data mart discussion, platform
consolidation is being examined within these tiers today. The
enterprise data warehouse can be the point of consolidation for the
ODS and multiple data marts. Although different logical models
remain, they are sometimes consolidated to a single platform.

Oracle Essentials [Electronic resources] : Oracle Database 10g, 3rd Edition نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی