 |
|
Using Parallel Query Hints
Oracle Tips by Burleson Consulting |
Invoking the parallel query with hints
requires several steps. The most important is that the execution
plan for the query must specify a full-table scan. If the output of
the execution plan does not indicate a full-table scan, the query
can be forced to ignore the index by using the full hint.
There are two permutations of parallel hints, the parallel
hint and the pq_distribute hint. Let’s take a look at each
and see how they work.
The parallel Hint
The number of processors dedicated to service
a SQL request is ultimately determined by Oracle Query Manager, but
the programmer can specify the upper limit on the number of
simultaneous processes. When using the cost-based optimizer, the
parallel hint can be embedded into the SQL to specify the number
of processes. For instance:
select /*+
FULL(employee_table) PARALLEL(employee_table, 35) */
employee_name
from
employee_table
where
emp_type = 'SALARIED';
If you are using an SMP or MPP database server with
many CPUs, you can issue a parallel request and leave it up to each
Oracle instance to use its default degree of parallelism. For
example:
select /*+
FULL(employee_table) PARALLEL(employee_table, DEFAULT, DEFAULT) */
employee_name
from
employee_table
where
emp_type = 'SALARIED';
In most cases, it is better for the Oracle
Remote DBA to determine the optimal degree of parallelism and then set that
degree in the data dictionary with the following command:
Alter table
employee_table parallel degree 35;
This way, the Remote DBA can always be sure of the degree of
parallelism for any particular table.
Oracle also provides the
parallel_automatic_tuning init.ora parameter to assist in
setting the best degree of parallelism. When setting
parallel_automatic_tuning, you only need to specify parallelism
for a table, and Oracle will dynamically change the
parallel_adaptive_multi_user parameter to override the execution
plan in favor of maintaining an acceptable overall load on the
database. You should also note that setting
parallel_automatic_tuning will cause extra storage in the large
pool because Oracle will allocate message buffers from the large
pool instead of the shared pool.
The pq_distribute Hint
The pq_distribute hint is used in data
warehouses to improve parallel join operation performance when using
partitioned tables. The pq_distribute hint allows you to
specify how rows of joined tables should be distributed among
producer and consumer parallel query servers. The pq_distribute
hint accepts three parameters: the table name, the outer
distribution, and the inner distribution.
As I discussed in Chapter 10, we always want
to avoid the PARALLEL_TO_PARALLEL execution plan when performing a
parallel query join. Performing a PARALLEL_TO_PARALLEL operation
means the incoming and outgoing data streams are parallelized,
resulting in slow join performance. On the other hand, invoking the
PARALLEL_COMBINED_WITH_PARENT operation means that sort and merge
operations are combined into one operation.
Prior to the use of the pq_distribute
hint, Oracle Remote DBAs would often fake out the SQL optimizer by deleting
the CBO statistics on the inner table to force the
PARALLEL_COMBINED_WITH_PARENT operation. This is because the SQL
optimizer evaluates the size of candidate broadcast tables according
to the CBO statistics. If a table is above a threshold value, the
table will be joined via the PARALLEL_TO_PARALLEL execution mode,
resulting in very slow execution times.
There are six acceptable combinations for
table distribution with the pq_distribute hint. We use the
emp table in these examples. Remember that the order of the
parameters is outer distribution followed by inner distribution.
-
pq_distribute(emp,
hash, hash) This maps the rows of each table to consumer
parallel query servers using a hash function on the join keys.
When mapping is complete, each query server performs the join
between a pair of resulting partitions. This hint is recommended
when the tables are comparable in size and the join operation is
implemented by hash join or sort merge join.
-
pq_distribute(emp, broadcast, none)
This ensures that all rows of the outer table are broadcast to
each parallel query server, while the inner table rows are
randomly partitioned. This hint is recommended when the outer
table is very small compared to the inner table. A rule of thumb
is to use the Broadcast/None hint if the size of the inner table times the
number of parallel query servers is greater than the size of the
outer table.
-
pq_distribute(emp, none, broadcast)
This forces all rows of the inner table to be broadcast to
each consumer parallel query server. The outer table rows are
randomly partitioned. This hint is recommended when the inner
table is very small compared to the outer table. A rule of thumb
is to use the None/Broadcast hint if the size of the inner table times the
number of parallel query servers is less than the size of the
outer table.
-
pq_distribute(emp, partition, none)
This maps the rows of the outer table using the partitioning
of the inner table. The inner table must be partitioned on the
join keys. This hint is recommended when the number of partitions
of the outer table is equal to or nearly equal to a multiple of
the number of parallel query servers.
-
pq_distribute(emp, none, partition)
This combination maps the rows of the inner table using the
partitioning of the outer table. The outer table must be
partitioned on the join keys. This hint is recommended when the
number of partitions of the outer table is equal to or nearly
equal to a multiple of the number of query servers.
-
pq_distribute(emp, none, none)
Each parallel query server performs the join operation between a
pair of matching partitions, one from each table. Both tables must
be equi-partitioned on the join keys.
This is an excerpt from "Oracle High-Performance SQL Tuning" by
Donald K. Burleson, published by Oracle Press.