The Data Warehouse Development Life Cycle
Aggregating Data For The Oracle Warehouse
Determining The Number Of Aggregate Tables
Now that we have listed the data attributes that
might become dimensions in pre-summarized tables, it becomes
apparent that pre-calculating every possible pairing of dimensions
would involve many combinations. But how many summary tables is that?
Let’s start with a simple example. Assume that we have four
attributes: A, B, C, and D. Taking them two at a time, we would have
six possible combinations, namely, A-B, A-C, A-D, B-C, B-D, and C-D. As it turns
out, the following formula can be used to determine the number of
possible tables:
Number of dimension summary tables = (n)! / ((n-2)! * 2)
Where n = the number of dimension attributes
For four dimensions, we can quickly compute that there are six
summary tables that can be built against the database, as follows:
Number of dimension summary tables = (4)! / ((4-2)! * 2)
Number of dimension summary tables = 24 / 4
Number of dimension summary tables = 6
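As a quick check, the short Python sketch below lists the pairs for the four-attribute example and compares the count with the formula. The function name pair_count is an illustrative assumption, and the formula is read with the whole product (n-2)! * 2 as the divisor.

```python
from itertools import combinations
from math import factorial


def pair_count(n: int) -> int:
    """Number of two-attribute summary tables for n attributes (n >= 2).

    Implements n! / ((n-2)! * 2), with the product as the divisor.
    """
    return factorial(n) // (factorial(n - 2) * 2)


attributes = ["A", "B", "C", "D"]

# Enumerate every unordered pair of attributes, e.g. "A-B".
pairs = ["-".join(pair) for pair in combinations(attributes, 2)]

print(pairs)                         # ['A-B', 'A-C', 'A-D', 'B-C', 'B-D', 'C-D']
print(pair_count(len(attributes)))   # 6
```

The enumerated list and the formula agree on six tables, which confirms how the divisor must be grouped.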
However, in our example, we have 10 dimensions. We can compute the
number of possible combinations of these attribute tables, as
follows:
Number of dimension summary tables = (10)! / ((10-2)! * 2)
Number of dimension summary tables = 3,628,800 / (40,320 * 2)
Number of dimension summary tables = 3,628,800 / 80,640
Number of dimension summary tables = 45
Note: The number of combinations equals 45, but the real number of
tables we need is 90 since we must aggregate two facts. This is
because we have to perform the calculations twice: once to summarize
the total_cost fact and again for the quantity_sold fact.
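The arithmetic in this note can be reproduced with a minimal Python sketch. The count of 10 dimensions and the two fact names come from our example; the variable names are illustrative assumptions only.

```python
from math import comb

NUM_DIMENSIONS = 10                      # dimensions in our example
FACTS = ["total_cost", "quantity_sold"]  # facts to be aggregated

# comb(n, 2) counts the unordered pairs of n items, which equals
# n! / ((n-2)! * 2).
pair_tables = comb(NUM_DIMENSIONS, 2)

# Each pair of dimensions needs one summary table per fact.
total_tables = pair_tables * len(FACTS)

print(pair_tables)    # 45
print(total_tables)   # 90
```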
Also, we will not be able to create summary hierarchies like our
Excel pivot table did in Chapter 5, so each aggregate table will be
isolated and referenced by a name built from its pair of dimensions.
If we are going to create 45 tables for each fact, we need to
provide meaningful names to our tables. Sample table names might
include MONTH_BY_YEAR, MONTH_BY_QUARTER, and MONTH_BY_CUSTOMER_NAME.
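With this many tables, the names are probably best generated rather than typed by hand. The Python sketch below is one possible approach, not a prescribed convention: it builds X_BY_Y names from a short, hypothetical list of dimension names taken from the samples above. A real script would list all 10 dimensions and would likely add the fact name to each table name, since each pair is built once per fact.

```python
from itertools import combinations

# Hypothetical subset of the dimensions, taken from the sample table names.
dimensions = ["month", "year", "quarter", "customer_name"]


def table_name(first: str, second: str) -> str:
    """Build a name such as MONTH_BY_YEAR from two dimension names."""
    return f"{first}_BY_{second}".upper()


# One name per unordered pair, in the order the dimensions are listed.
names = [table_name(a, b) for a, b in combinations(dimensions, 2)]

for name in names:
    print(name)
# MONTH_BY_YEAR
# MONTH_BY_QUARTER
# MONTH_BY_CUSTOMER_NAME
# YEAR_BY_QUARTER
# YEAR_BY_CUSTOMER_NAME
# QUARTER_BY_CUSTOMER_NAME
```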