|
|
|
Oracle SQL all_rows, first_rows, rule
In order to understand the evolution of SQL
optimization we have to take a historical
perspective. In the early 1980s when commercial
relational databases were first introduced, the
SQL optimizers were very primitive and relied on
simple heuristic in order to determine the
optimal execution plan for any given query.
This was called rule-based optimization. The
rule-based optimizer uses simple data dictionary
statistics in order to derive the SQL execution
plan.
However,
rule-based optimization is not aware of
detailed statistics about the nature of the
tables and indexes. For example, the
rule-based optimizer does not know the
number of distinct values within an index,
or the distribution of values within index.
Hence, rule-based optimizer is far more
likely than cost-based optimization to
choose an inappropriate index to service
take query. An inappropriate index is an
index that is not selective. For example,
consider this query:
select
book_title
from
book
where
book_type = ‘computer’
and
book_title = ‘DOS for Dummies’;
In this
case, a book_type index or the book_title
index could be used, but the book_title
index is likely to be far more selective
than the book_type index.
Recognizing
the shortcomings of rule-based SQL
optimization, the relational database
vendors began to add additional intelligence
into their SQL optimizing techniques. In
order to choose the best execution plan for
a given query, the vendors recognized that
it was necessary to gather detailed
statistics about the tables and indexes.
Some of these statistics include:
-
The
number of rows in a table
-
The
average row length of rows in the table
-
The
carnality of the indexes on the table
-
The
distribution of data column values
within the table
The
cost-based optimizer was created to use
these statistics in order to make a more
intelligent decision about the best way to
service to query. However here we must
refine our definition of what we mean by
“best” execution plan.
Is the best
execution plan the one that begins to return
rows to the query the fastest, or is the
best execution plan the one that services
the query with the minimal amount of machine
resources? Let's illustrate this choice
with a simple example.
Assume that
we have a simple query that selects
1,000,000 rows from the customer table, and
orders the result by customer name:
select cust_name from customer order by
cust_name;
Let's also
assume that we have an index on the
cust_name column. The SQL optimizer has a
choice of methods to produce the result set:
Choice 1
- The database can use the cust_name index
to retrieve the customer table rows. This
will alleviate the need for sorting the
result set at the end of the query, but
using the index has the downside of causing
additional I/O within the database as the
index nodes are accessed.
Choice 2
- The database can perform a parallel full
table scan against the table and then sort
the result set on desk. This execution plan
will generally result in less overall disk
I/O resources than using the index, but the
downside to this optimization technique that
no rows from the query will be available
until the entire query has been completed.
For a giant query, this could take several
minutes.
Hence, we
see two general approaches to SQL query
optimization. The use of the indexes to
avoiding sorting been codified within Oracle
as the first_rows optimization technique.
Under first_rows optimization, the optimizer
goal is to begin to return rows to the query
as quickly as possible, even if it means
extra disk I/O. Conversely, the all_rows
optimizer goal is designed to minimize
overall machine resources. Under all_rows
optimization the goal is to minimize the
amount of machine resources and disk I/O for
the query. Hence, the all_rows optimizer
mode tends to favor full table scans, and is
generally used in large data warehouses
where immediate response time is not
required.
|
|
|
|
|
|
| |
BC
Remote Oracle Support
P.O. Box 511 • Kittrell, NC, 27544

Remote DBA
Copyright © 1996 -
2009 by Burleson Enterprises, Inc. All rights reserved.
Oracle® is the registered trademark of Oracle Corporation.
|
|