 |
|
Update SQL, Subqueries, and Parallel DML
Oracle Tips by Burleson Consulting |
Updates with where constraints that
contain correlated subqueries are one of the most confounding
problems of the SQL update statement. This is because of the rule
that an update statement may only contain a single table name.
Hence, we not add a correlated subquery when the values of other
table rows influence our update decision.
To illustrate, consider the following SQL
that gives a 10 percent raise to all employees who did not receive a
bonus last year. To do this, we must execute a NOT IN anti-join
against the bonus table.
update
emp
set
sal= sal+ (sal*.1)
where
ename NOT IN
(select /*+ first_rows */
ename
from
bonus
where
emp.ename = bonus.ename
and
to_char(bonus_date,'YYYY') = '2000'
);
Now, let’s explain the query. We expect to see a
full-table scan against the emp table.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
UPDATE STATEMENT
1
UPDATE
EMP
1
FILTER
1
TABLE ACCESS
FULL
EMP
1
TABLE ACCESS
BY INDEX ROWID
BONUS
2
INDEX
RANGE SCAN
BONUS_ENAME
1
As we expected, we see a full-table scan against the
emp table, followed by an index range scan against the
bonus table.
As you recall from Chapter 16, the NOT IN
anti-join is always serviced by a full-table scan against the outer
table, and you have learned that a NOT IN subquery can often be
replaced with an outer join and a where bonus IS NULL.
However, since SQL only allows a single table name in the outer
query, it is impossible to remove the NOT IN clause by replacing it
with a standard outer join.
So, given that we are stuck with a full-table
scan, how can we improve the performance of the update statement? If
we are on an Oracle server with lots of CPUs, we can add a
parallel hint to the SQL update statement. Here, we
invoke 25 parallel processes to partition and update the emp
rows:
update /*+
parallel(emp,35) */
emp
set
sal= sal+ (sal*.1)
where
ename NOT IN
(select /*+ first_rows */
ename
from
bonus
where
emp.ename = bonus.ename
and
to_char(bonus_date,'YYYY') = '2000'
);
Here is the revised execution plan:
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
OTHER_TAG
----------------------------------------------------------------------
UPDATE STATEMENT
1
UPDATE
EMP
1
FILTER
1
TABLE ACCESS
FULL
EMP
1
PARALLEL_TO_SERIAL
TABLE ACCESS
BY INDEX ROWID
BONUS
2
INDEX
RANGE SCAN
BONUS_ENAME
1
It might be tempting to try to improve the speed of
the update scan by adding a function-based index on the
bonus_date column to see if the CBO will choose to use the
bonus_date instead of the ename column.
SQL> create
index bonus_date_year on bonus (to_char(bonus_date,'YYYY'));
However, the CBO recognizes that the ename
index has far more unique values than our bonus_date, and
this will have no effect of the execution plan for the query.
The main point of this section is that
correlated subqueries are unavoidable for complex DML because of the
rule that only one table name can appear in the DML statement.
Hence, our only way to improve performance is to add parallel
DML hints to improve the speed of the required full-table scan.
Oracle also provides parallelism for all SQL
update, insert and insert as select operations, as
well as parallel index re-building. For example, when rebuilding a
global partitioned index, Oracle will recognize that the index is
partitioned and allow parallel query processes to independently read
each partition of the index.
Next, let’s take a look at how referential
integrity (RI) constraints affect the performance of DML statements.
This is an excerpt from "Oracle High-Performance SQL Tuning" by
Donald K. Burleson, published by Oracle Press.