 |
|
Tuning Oracle SQL Sorting with STATSPACK
Oracle Tips by Burleson Consulting |
As a small but very important component of
SQL syntax, sorting is a frequently overlooked aspect of Oracle
tuning. In general, an Oracle database will automatically perform
sorting operations on row data as requested by a create index
or a SQL order by or group by statement. In general,
Oracle sorting occurs under the following circumstances:
-
SQL using the ORDER BY clause
-
SQL using the GROUP BY clause
-
When an index is created
-
When a MERGE SORT is invoked by the SQL
optimizer because inadequate indexes exist for a table join
At the time a session is established with
Oracle, a private sort area is allocated in RAM for use by the
session for sorting. If the connection is via a dedicated
connection, a Program Global Area (PGA) is allocated according to
the sort_area_size init.ora parameter. For connections
via the multithreaded server, sort space is allocated in the
large_pool. Unfortunately, the amount of memory used in sorting
must be the same for all sessions, and it is not possible to add
additional sort areas for tasks that require large sort operations.
Therefore, the designer must strike a balance between allocating
enough sort area to avoid disk sorts for the large sorting tasks and
keeping in mind that the extra sort area will be allocated and not
used by tasks that do not require intensive sorting. Of course,
sorts that cannot fit into the sort_area_size will be paged
out into the TEMP tablespaces for a disk sort. Disk sorts are about
14,000 times slower than memory sorts.
TIP: When using SQL*Plus, or when a user
knows that he or she will perform a large sort, the alter session
set sort_area_size command can be used to enlarge the sort area.
As I noted, the size of the private sort area
is determined by the sort_area_size init.ora
parameter. The size for each individual sort is specified by the
sort_area_retained_size init.ora parameter. Whenever a sort
cannot be completed within the assigned space, a disk sort is
invoked using the temporary tablespace for the Oracle instance.
Disk sorts are expensive for several reasons.
First, they are extremely slow when compared to an in-memory sort.
Also, a disk sort consumes resources in the temporary tablespace.
Oracle must also allocate buffer pool blocks to hold the blocks in
the temporary tablespace. In-memory sorts are always preferable to
disk sorts, and disk sorts will surely slow down an individual task
as well as impact concurrent tasks on the Oracle instance. Also,
excessive disk sorting will cause a high value for free buffer
waits, paging other tasks’ data blocks out of the buffer.
The following STATSPACK query uses the
stats$sysstat table. From this table we can get an accurate
picture of memory and disk sorts.
rpt_sorts_alert.sql
set pages
9999;
column mydate heading 'Yr. Mo Dy Hr.' format a16
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format
.99999
select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
newmem.value-oldmem.value sorts_memory,
newdsk.value-olddsk.value sorts_disk,
((newdsk.value-olddsk.value)/(newmem.value-oldmem.value)) ratio
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
and
newdsk.value-olddsk.value > 100
;
Here is the output from the script. Here, you can
clearly see the number of memory sorts and disk sorts, and the ratio
of disk to memory sorts.
Yr.
Mo Dy Hr. SORTS_MEMORY SORTS_DISK
RATIO
---------------- ------------ ------------ -------
2000-12-20 12
13,166 166
.01261
2000-12-20 16
25,694 223
.00868
2000-12-21 10
99,183 215
.00217
2000-12-21 15
13,662 130
.00952
2000-12-21 16
17,004 192
.01129
2000-12-22 10
18,900 141
.00746
2000-12-22 11
19,487 131
.00672
2000-12-26 12
12,502 147
.01176
2000-12-27 13
20,338 118
.00580
2000-12-27 18
11,032 119
.01079
2000-12-28 16
16,514 205
.01241
2000-12-29 10
17,327 242
.01397
2000-12-29 16
50,874 167
.00328
2001-01-02 08
15,574 108
.00693
2001-01-02 10
39,052 136
.00348
2001-01-03 11
13,193 153
.01160
2001-01-03 13
19,901 104
.00523
2001-01-03 15
19,929 130
.00652
This report can be changed to send an alert when the
number of disk sorts exceeds a predefined threshold, and we can also
modify it to plot average sorts by hour of the day and day of the
week. The script that follows computes average sorts, ordered by
hour of the day.
rpt_avg_sorts_hr.sql
set pages
9999;
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format
.99999
select
to_char(snap_time,'HH24'),
avg(newmem.value-oldmem.value) sorts_memory,
avg(newdsk.value-olddsk.value) sorts_disk
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
group by
to_char(snap_time,'HH24')
;
Here is the output from the script. We can now take
this data and create a graph in a spreadsheet.
TO
SORTS_MEMORY SORTS_DISK
-- ------------ ------------
00 18,855
11
01 19,546
15
02 10,128
5
03 6,503
8
04 10,410
4
05 8,920
5
06 8,302
7
07 9,124
27
08 13,492
71
09 19,449
55
10 19,812
106
11 17,332
78
12 20,566
76
13 17,130
46
14 19,071
61
15 19,494
68
16 20,701
79
17 19,478
44
18 23,364
29
19 13,626
20
20 11,937
17
21 8,467
7
22 8,432
10
23 11,587
10
Figure 22-1 shows the plot from the output. Here you
see a typical increase in sort activity during the online period of
the day. Sorts rise at about 8:00
a.m. and then go down
after 6:00 p.m.
Figure 1: Average
memory sorts by hour of the day
Now, let’s run the script to compute the
averages by the day of the week.
rpt_avg_sorts_dy.sql
set pages
9999;
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format
.99999
select
to_char(snap_time,'day') DAY,
avg(newmem.value-oldmem.value) sorts_memory,
avg(newdsk.value-olddsk.value) sorts_disk
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
group by
to_char(snap_time,'day')
;
Again, we will take the result set and plot it in a
chart. This time, let’s plot the disk sorts.
DAY
SORTS_MEMORY SORTS_DISK
--------- ------------ ------------
friday 12,545
54
monday 14,352
29
saturday 12,430
2
sunday 13,807
4
thursday 17,042
47
tuesday 15,172
78
wednesday 14,650
43
Figure 22-2 shows the
graph. In this database, the activity pattern on Tuesday shows a
large number of disk sorts, with another smaller spike on Thursdays.
For this database, the Remote DBA may want to pay careful attention to the
TEMP tablespaces on these days, and perhaps issue an alter
session command to create continuous extents in the TEMP
tablespace.
To do this, you may need a special command
using events:
rem Used to
clean up temp segments in temporary tablespace
rem x is the temp tablespace number in TS$ plus
rem
ALTER SESSION SET EVENTS 'IMMEDIATE TRACENAME DROP_SEGMENTS LEVEL
&x';
Figure 2: Average disk sorts by day of the
week
At the risk of being redundant, I need to
reemphasize that the single most important factor in the performance
of any Oracle database is the minimization of disk I/O. Hence, the
tuning of the Oracle sorting remains one of the most important
considerations in the tuning of any Oracle database.
This is an excerpt from "Oracle High-Performance SQL Tuning" by
Donald K. Burleson, published by Oracle Press.