 |
|
Bitmapped
Index Usage
Oracle Tips by Burleson Consulting
|
A bitmapped index is used for low-cardinality
data such as sex, race, hair color, and so on. If a column to be
indexed has a selectivity of greater than 30 to 40 percent of the
total data, then it is probably a good candidate for bitmap indexing.
Bitmap indexing is not suggested for
high-cardinality, high-update, or high-delete-type data, as bitmap
indexes may have to be frequently rebuilt in these type situations.
There are three things to consider when
choosing an index method:
* Performance
* Storage
* Maintainability
The major advantages of using bitmapped
indexes are performance impact for certain queries and their
relatively small storage requirements. Note, however, that bitmapped
indexes are not applicable to every query; and bitmapped indexes, like
B-tree indexes, can impact the performance of insert, update, and
delete statements. Bitmaps store large amounts of data about various
rows in each block of the index structure, and because bitmap locking
is at the block level, any insert, update, or delete activity may lock
an entire range of values.
Bitmapped indexes can provide very impressive
performance improvements. Under test conditions, the execution times
of certain queries improved by several orders of magnitude. The
queries that benefit the most from bitmapped indexes have the
following characteristics:
* The WHERE clause contains multiple
predicates on low-cardinality columns.
* The individual predicates on these
low-cardinality columns select a large number of rows.
* Bitmapped indexes have been created on some
or all of these low-cardinality columns.
* The tables being queried contain many rows.
An advantage of bitmapped indexes is that
multiple bitmapped indexes can be used to evaluate the conditions on a
single table. Thus, bitmapped indexes are very useful for complex ad
hoc queries that contain lengthy WHERE clauses involving
low-cardinality data.
Bitmapped indexes incur a small storage cost
and have a significant storage savings over B-tree indexes. A
bitmapped index can require 100 times less space than a B-tree index
for a low-cardinality column.
But with all those advantages in mind, you
must remember that a strict comparison of the relative sizes of B-tree
and bitmapped indexes is not an accurate measure for selecting
bitmapped over B-tree indexes. Because of the performance
characteristics of bitmapped indexes and B-tree indexes, you should
continue to maintain B-tree indexes on your high-cardinality data.
Bitmapped indexes should be considered primarily for your
low-cardinality data.
The storage savings incurred by bitmapped
indexes are so large because they replace multiple-column B-tree
indexes. In addition, single-bit values replace possibly long columnar
type data. When using only B-tree indexes, you must anticipate the
columns that will commonly be accessed together in a single query and
then create a multicolumn B-tree index on those columns. Not only does
this B-tree index require a large amount of space, but it will also be
ordered; that is, a B-tree index on (MARITAL_STATUS, RACE, SEX) is
useless for a query that only accesses RACE and SEX. To completely
index the database, you will be forced to create indexes on the other
permutations of these columns. In addition to an index on (MARITAL_STATUS,
RACE, SEX), there is a need for indexes on (RACE, SEX, MARITAL_STATUS),
(SEX, MARITAL_STATUS, RACE), and so on. For the simple case of three
low-cardinality columns, there are six possible concatenated B-tree
indexes. This means that you will be forced to decide between disk
space and performance when determining which multiple-column B-tree
indexes to create.
With bitmapped indexes, the problems
associated with multiple-column B-tree indexes are solved because
bitmapped indexes can be efficiently combined during query execution.
Three small single-column bitmapped indexes are a sufficient
functional replacement for six three-column B-tree indexes. Note that
while the bitmapped indexes may not be quite as efficient during
execution as the appropriate concatenated B-tree indexes, the space
savings provided by bitmapped indexes can often more than justify
their utilization.
The net storage savings will depend upon a
database’s current usage of B-tree indexes:
* A database that relies on single-column
B-tree indexes on high-cardinality columns will not observe
significant space savings (but should see significant performance
increases).
* A database containing a significant number
of concatenated B-tree indexes could reduce its index storage usage by
50 percent or more, while maintaining similar performance
characteristics.
* A database that lacks concatenated B-tree
indexes because of storage constraints will be able to use bitmapped
indexes and increase performance with minimal storage costs.
Bitmapped indexes are best for read-only or
light online transaction-processing (OLTP) environments. Because there
is no effective method for locking a single bit, row-level locking is
not available for bitmapped indexes. Instead, locking for bitmapped
indexes is effectively at the block level, which can impact heavy OLTP
environments. Note also that like other types of indexes, updating
bitmapped indexes is a costly operation.
Although bitmapped indexes are not appropriate
for databases with a heavy load of insert, update, and delete
operations, their effectiveness in a data warehousing environment is
not diminished. In such environments, data is usually maintained via
bulk inserts and updates. For these bulk operations, rebuilding or
refreshing the bitmapped indexes is an efficient operation. The
storage savings and performance gains provided by bitmapped indexes
can provide tremendous benefits to data warehouse users.
In preliminary testing of bitmapped indexes,
certain queries ran up to 100 times faster. The bitmapped indexes on
low-cardinality columns were also about 10 times smaller than B-tree
indexes on the same columns. In these tests, the queries containing
multiple predicates on low-cardinality data experienced the most
significant speed-ups. Queries that did not conform to this
characteristic were not assisted by bitmapped indexes. Bitmapped
composite indexes cannot exceed 30 columns.
Example Index Scenarios
The following sample queries on the CUSTOMERS
table demonstrate the variety of query-processing techniques that are
necessary for optimal performance.
Example 1: Single Predicate on a
Low-Cardinality Attribute
select *
from customers where gender = ‘male’;
Best approach: parallel table scan.
This query will return approximately 50
percent of the data. Since we will be accessing such a large number of
rows, it is more efficient to scan the entire table rather than use
either bitmapped indexes or B-tree indexes. To minimize elapsed time,
the server should execute this scan in parallel.
Example 2: Single Predicate on a
High-Cardinality Attribute
select *
from customers where customer# = 101;
Best approach: conventional unique index.
This query will retrieve at most one record
from the employee table. A B-tree index or hash cluster index is
always appropriate for retrieving a small number of records based upon
criteria on the indexed columns.
Example 3: Multiple Predicates on
Low-Cardinality Attributes
select *
from customers where gender = ‘male’ and region in (‘central’,
‘west’) and marital_status in (‘married’, ‘divorced’);
Best approach: bitmapped index.
Though each individual predicate specifies a
large number of rows, the combination of all three predicates will
return a relatively small number of rows. In this scenario, bitmapped
indexes provide substantial performance benefits.
Example 4: Multiple Predicates on Both
High-Cardinality and Low-Cardinality Attributes
select *
from customers where gender = ‘male’ and customer# < 100;
Best approach: B-tree index on CUSTOMER#.
This query returns a small number of rows
because of the highly selective predicate on customer#. It is more
efficient to use a B-tree index on customer# than to use a bitmapped
index on gender.
In each of the previous examples, the Oracle
cost-based optimizer transparently determines the most efficient
query-processing technique if the tables and indexes have
representative statistics present in the database.
The BITMAP clause (version 7.3.2 and later)
causes the index to be stored as a bitmap and should only be used for
low-cardinality data such as sex, race, and so on. The option is
available only as beta in pre-7.3.2.2 releases and is bundled with the
parallel query option. Several initialization parameters and event
settings are required to use the option in earlier versions of 7.3:
Initialization parameters (must be set
regardless of version):
COMPATIBLE
set to 7.3.2 or higher
V733_PLANS_ENABLED set to TRUE
Events (must be set prior to 7.3.2.3):
event = "10111 trace name context forever"
event = "10112 trace name context forever"
event = "10114 trace name context forever"
Creation of a Partitioned Index: An Example
As already noted, Oracle8 introduced the
concept of a partitioned index; and Oracle8i and Oracle9i have
continued and expanded upon partitioning concepts related to indexes.
A partitioned index goes hand in hand with partitioned tables. In
fact, usually a partitioned table will have partitioned indexes by
default. A prefixed index is defined as an index whose leftmost
columns correspond exactly with those of the partition key. In the
arena of partitioned indexes, the concept of prefixed indexes is
important because:
* Unique prefixed indexes guarantee that you
only need to access one index partition to get the data.
* Nonunique prefixed indexes still guarantee
you only need one index partition, if
you provide the full partition key as part of
the WHERE clause. The caveat to
this is that if you provide only part of the
partition key, all partitions will be scanned.
Let’s look at two quick examples.
CREATE TABLE
sales
(acct_no NUMBER(5) NOT NULL,
sales_person_id NUMBER(5) NOT NULL,
po_number VARCHAR2(10) NOT NULL,
po_amount NUMBER(9,2),
month_no NUMBER(2) NOT NULL)
PARTITION BY RANGE (month_no)
PARTITION first_qtr VALUES LESS THAN (4),
PARTITION sec_qtr VALUES LESS THAN (7),
PARTITION thrd_qtr VALUES LESS THAN(10),
PARTITION frth_qtr VALUES LESS THAN(13),
PARTITION bad_qtr VALUES LESS THAN (MAXVALUE));
CREATE INDEX pt_sales
ON sales (month_no,
sales_person_id,acct_no,po_number)
LOCAL;
Notice in this example that we didn’t have to
specify the index partitions. This is because we used the LOCAL clause
that tells Oracle to use the same partition logic as the master table.
A suitable prefix is added to differentiate the indexes. One problem
is that the indexes, if the location is not specified, will be placed
in the same tablespace as the table partitions. A better form to use
would be:
CREATE INDEX
pt_lc_sales
ON sales (month_no, sales_person_id,acct_no,po_number)
LOCAL(
PARTITION i_first_qtr TABLESPACE part_ind_tbsp1,
PARTITION i_sec_qtr TABLESPACE part_ind_tbsp2,
PARTITION i_thrd_qtr TABLESPACE part_ind_tbsp3,
PARTITION i_frth_qtr TABLESPACE part_ind_tbsp4,
PARTITION i_bad_qtr TABLESPACE part_ind_tbsp5);
The other choice is to use a GLOBAL index;
this is a partitioned index that doesn’t use the same partitioning as
the base table. Let’s look at an example:
CREATE INDEX
pt_gl_sales
ON sales (month_no, sales_person_id,acct_no,po_number)
GLOBAL
PARTITION BY RANGE (month_no)
(PARTITION i_gl_sales1 VALUES LESS THAN (6)
TABLESPACE sales_index1,
PARTITION i_gl_sales2 VALUES LESS THAN (MAXVALUE)
TABLESPACE sales_index2));
Here are some guidelines for the use of
partitioned indexes:
* Use local prefixed indexes whenever
possible.
* It is more expensive to scan a nonprefixed
index, due to more index probes required.
* Unique local nonprefixed indexes are not
supported.
* DML operations on global unique indexes are
not supported in parallel update.
* Global prefixed indexes can minimize the
number of index probes.
See
Code Depot for Full Scripts
 |
This is an excerpt
from Mike Ault, bestselling author of "Oracle
10g Grid and Real Application Clusters".
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |
 |
Expert Remote DBA
BC is America's oldest and largest Remote DBA Oracle support
provider. Get real Remote DBA experts, call
BC Remote DBA today. |
 |
|