 |
|
Preventing Unwanted Full-Table Scans
Oracle Tips by Burleson Consulting |
There are several conditions that contribute
to an unnecessary full-table scan, and it is the job of the SQL
tuning professional to locate and correct these conditions. They
include both system-wide dictionary issues and issues with
individual SQL statements.
Dictionary Issues
There are several issues with regard to
information in the Oracle data dictionary that can cause the SQL
optimizers to invoke a full-table scan against a table.
Tables with a Too-High High-Water Mark
These tables should be reorganized to lower
the high-water mark to the appropriate level. When a substantial
number of rows are deleted from a table, the high-water mark remains
at the prior level, giving the CBO a false sense of the size of the
table (see Figure 6-1).
Figure 1: A
misleading high-water mark
No Column Histograms on Skewed Indexes
An index with highly skewed values should be
analyzed for column histograms so that the CBO will know when it is
proper to use an index. For example, assume we have a bitmapped
index on a region column, and that 90 percent of the values are for
the Northern region (see Figure 6-2).
Figure 2: A skewed
index distribution
In this case, queries against the Southern,
Eastern, and Western regions will be faster using the index, while
queries that specify the Northern region will be better with a
full-table scan. It should be noted that Oracle9i will offer
additional enhancements to histogram functionality. Instead of the
Remote DBA needing to determine which indexes have skewed distributions,
the Oracle9i DBMS_STATS package will identify and gather
histogram statistics on columns that it has determined will benefit
from those statistics.
Using CHOOSE or ALL_ROWS as the optimizer_mode
These modes favor the full-table scan.
Setting Parallel Query
Setting a table to allow parallel query (for
example, alter table customer parallel 35;) will often cause
the CBO to invoke a false full-table scan. Be very careful when
altering a table for parallelism because it can have disastrous
effects on query performance.
SQL Syntax and Full-Table Scans
Within the scope of a SQL statement, there
are many conditions that will cause the SQL optimizer to invoke a
full-table scan. The Oracle SQL tuner should always be on the
lookout for these types of conditions.
Queries with NULL Conditions
Oracle cannot use an index to select NULL
column values because the NULLs are not stored in the index. The
example that follows would invoke a full-table scan:
select
emp_name
from
emp
where
middle_name IS NULL;
Some developers get around the issue by replacing all
NULL values with a literal “N/A” and then searching for N/A values.
The following actions would prevent a full-table scan:
update
emp
set
middle_name to ‘N/A’
where
middle_name IS NULL;
select
emp_name
from
emp
where
middle_name = ‘N/A’;
Queries Against Unindexed Columns
It is not uncommon to see queries in the
library cache that specify columns in the where clause that
do not have indexes. Finding these types of SQL statements is
especially pleasant because an index can be quickly created to
dramatically improve the performance of the query.
Queries with Like Conditions
Queries that use the like clause will
invoke a full-table scan if the percent sign mask is used in the
leading side of the query. For example, the following clause would
not cause a full-table scan, because the like mask begins
with characters and the existing index will be able to service the
query.
select
ename
job,
hiredate
from
emp
where
ename like ‘BURLE%’;
Here we see that the ename_idx can be used for
this like condition.
OPERATION
----------------------------------------------------------------------
OPTIONS OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
1
INDEX
RANGE SCAN ENAME_IDX
1
However, we run into full-table scans when the like
mask has the percent sign in the beginning of the mask:
select
ename
job,
hiredate
from
emp
where
ename like ‘%SON’;
Here we see that a full-table scan is invoked and the
index cannot be used.
OPERATION
----------------------------------------------------------------------
OPTIONS OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
26
INDEX
FULL SCAN ENAME_IDX
1
If we have a lot of queries that use the like mask
with a beginning percent sign and trailing character values, we can
create a function-based index using the reverse BIF:
create
index
ename_reverse_idx
on emp
( reverse(ename) );
Next, we slightly alter our query to use the reverse
operator:
select
ename
job,
hiredate
from
emp
where
reverse(ename) like ‘NOS%’;
Here we see that our function-based index can be used
for the trailing like operator:
OPERATION
----------------------------------------------------------------------
OPTIONS OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
1
TABLE ACCESS
BY INDEX ROWID EMP
1
INDEX
RANGE SCAN ENAME_REVERSE_IDX
1
Queries with a Not Equals Condition
You can specify a not equals condition
in three ways in Oracle SQL. The following statements are identical:
select
ename from emp where job <> 'MANAGER';
select ename from emp where job != 'MANAGER';
select ename from emp where job not in ('MANAGER');
The not equals condition or a not
in condition will always use a full-table scan unless the column
values are skewed and column histograms indicate that the index scan
is faster. To illustrate, consider a region index where 90 percent
of the values are for the Northern region. The following query would
rightly choose a full-table scan because the index is very
nonselective.
select
customer_name,
customer_status
from
customer
where
region<>’Southern’.
However, the following query would be better served by
performing an index range scan because only a minority of the values
are not in the Northern region.
select
customer_name,
customer_status
from
customer
where
region<>’Northern’.
Invalidating an Index with a BIF
When an index column is altered with a BIF in
the where clause, the original index cannot be used. This
problem is especially prevalent with DATE datatypes where the date
is transformed to get a range of values. The following example will
use the date index.
select
customer_name,
customer_status
from
customer
where
sales_date < sysdate-7;
However, a query for July 2001 sales would not be able
to use the index because of the use of the to_char built-in
function (BIF).
select
customer_name,
customer_status
from
customer
where
to_char(sales_date,’YYYY-MM’)=’2001-06’;
In cases where a function is commonly used
against a column, function-based indexes can be created on the base
table to remove the full-table scan. Function-based indexes are most
commonly used with the following BIFS:
-
to_char These are commonly used
when extracting date datatype values for a day of the week (to_char(mydate,’day’)),
an hour of the day (to_char(mydate,’HH24’)), or a specific
range of dates.
-
substr The substr
function is often used to query subsets of fixed-length character
columns.
-
decode The decode function is
commonly used to translate abbreviated column identifiers into
displayable characters.
The Oracle tuning professional should always
be on the lookout for BIFs in SQL statements. The following SQL can
be run against the v$sqlarea view to quickly identify SQL
statements that might have a BIF in their where clause.
bif.sql
set lines
2000;
select
sql_text,
disk_reads,
executions,
parse_calls
from
v$sqlarea
where
lower(sql_text) like '% substr%'
or
lower(sql_text) like '% to_char%'
or
lower(sql_text) like '% decode%'
order by
disk_reads desc
;
Again, it is very important that the Oracle
professional constantly be on the lookout for BIF that are causing a
full-table scan. You can be the hero in these cases by quickly
creating a function-based index to speed up the query.
Using an ALL_ROWS hint
Developers often add ALL-ROWS hints after
reading that they will improve throughput, not realizing that
response time will suffer. The ALL_ROWS optimizer goal is designed
to improve throughput and tends to favor full-table scans. For any
database that requires SQL queries to return some of the result set
quickly, the optimizer_mode is set to FIRST_ROWS.
Using a Parallel Hint
Setting a parallel hint in a query will
always cause the CBO to invoke a full-table scan. This is because
all Oracle parallel queries must perform full-table scans in order
to dedicate multiple query slaves. Many beginners will add a
parallel hint to a SQL query, without testing to ensure that the
parallel full-table scan is faster than using the index.
Now that we have covered full-table scans,
let’s take a look at methods for improving the performance of index
range scans. Once we have tuned our SQL queries to use the index,
there are several techniques that can be used to improve access
speed, especially for queries that perform index range scans against
a large number of rows.
This is an excerpt from "Oracle High-Performance SQL Tuning" by
Donald K. Burleson, published by Oracle Press.