Oracle SQL Join
Oracle Tips by Burleson Consulting
How is join cardinality estimated?
In the absence of column histograms, the Oracle
CBO must be able to guess on information, and sometimes the guess is
wrong. This is one reason why the ORDERED hint is one of the most
popular SQL tuning hints, because using the ORDERED hint allows the
DBA to specify that the tables be joined together in the same order
that they appear in the FROM clause, like this:
select /+ ORDERED */
professor = ‘jones’
course = ‘anthropology 610’;
If the values for the PROFESSOR and COURSE table
columns are not skewed, it is unlikely that the 10g automatic
statistics would have created histograms buckets in the
dba_histograms view for these columns.
The Oracle CBO needs to be able to accurately
estimate the final number of rows returned by each step of the query
and then use schema metadata from running
dbms_stats to choose the
table join order that results in the least amount of baggage, in the
form of intermediate rows, from each of the table join operations.
This is a daunting task. When an SQL query has
a complex WHERE clause, it can be very difficult to estimate the
size of the intermediate result sets, especially when the WHERE
clause transforms column values with mathematical functions. Oracle
has made a commitment to making the CBO infallible, even when
incomplete information exists.
Oracle9i introduced the new dynamic sampling
method for gathering run-time schema statistics, and it is now
enabled by default in Oracle10g.
It's well established that histograms are very useful for helping
the optimizer choose between a full-scan and index-scan, and we
also see evidence that histograms help Oracle figure out inter-table
join cardinality (the amount of "baggage" that Oracle must carry
between table joins). The presence of histograms helps the CBO
determine the optimal table join order.
Oracle expert David Aldridge just published an outstand treatise
for join predicates" where he does an outstanding job explaining
the problem and how to use histograms as a solution. The thrust of
the argument is that histograms will help detect "skew" in table
join columns (STATE_NAME) when using synthetic keys (STATE#).
Aldridge shares his conclusions:
To sum up,
here are the solutions I thought of:
- Partition or subpartition the fact
table on STATE#. (preferred option)
- Create a summary table with
partitioning or subpartitoning on
STATE#. (uses most space and slows data
load, but very flexible and powerful)
- Create a function-based index on
fact to perform the lookup, and query
that value instead. (a bit flaky, but it
works without major system impact)
- Rebuild the fact table based on the
STATE_NAME instead. (still limited in
SEE CODE DEPOT FOR FULL SCRIPTS
This is an excerpt from my latest book "Oracle
Tuning: The Definitive Reference".
You can buy it direct from the publisher for 50%-off and get
instant access to the code depot of Oracle tuning scripts:
Expert Remote DBA
BC is America's oldest and largest Remote DBA Oracle support
provider. Get real Remote DBA experts, call
BC Remote DBA today.