 |
|
Overview of the Alert Scripts
Oracle Tips by Burleson Consulting
|
The scripts presented may
require customization to meet the requirements of your environment.
For example, there are dozens of dialects of UNIX, and dozens of ways
to issue UNIX commands. It is recommended that you become familiar
with shell programming before attempting to modify any of the scripts.
For example, the command to determine the free
space in your database server environment is different in many
dialects of UNIX. If you understand shell programming, you can
customize the script to handle a difference in dialect. In the
following example, we customize the free space command for an Oracle
server alert script. Note the use of the
$os variable, which is set to
the appropriate dialect of UNIX. The script generates the free space
commands for IRIX64, AIX, OSF1, and HP/UX.
# This code is because bdf and df -k display
free space in different columns
if [ $os = "IRIX64" ]
then
arch_dir_mp=`${dialect_df} $LOG_ARCHIVE_DEST|grep -v kbytes|awk
'{ print $7 }'`
arch_free_space=`${dialect_df} ${arch_dir_mp}|grep -v kbytes|awk
'{ print $3 }'`
fi
if [ $os = "AIX" ]
then
arch_dir_mp=`${dialect_df} $LOG_ARCHIVE_DEST|grep -v blocks|awk
'{ print $7 }'`
arch_free_space=`${dialect_df} ${arch_dir_mp}|grep -v blocks|awk
'{ print $3 }'`
fi
if [ $os = "OSF1" ]
then
arch_dir_mp=`${dialect_df} $LOG_ARCHIVE_DEST|grep -v blocks|awk
'{ print $7 }'`
arch_free_space=`${dialect_df} ${arch_dir_mp}|grep -v blocks|awk
'{ print $3 }'`
fi
if [ $os = "HP-UX" ]
then
arch_dir_mp=`${dialect_df} $LOG_ARCHIVE_DEST|grep -v kbytes|awk
'{ print $6 }'`
arch_free_space=`${dialect_df} ${arch_dir_mp}|grep -v kbytes|awk
'{ print $4 }'`
fi
Now, let's explore some of the great
information that you can extract from Oracle using these scripts.
Customized Exception Alert
Reports for the Remote DBAThree categories of reports are most commonly scheduled
by the Remote DBA. Together, the reports give the Remote DBA a complete picture of
database activity. Each of these scripts can be found at the Oracle
Press web site at www.oraclepressbooks.com.
This
comprehensive set of scripts provides a complete picture of the entire
database (see Figure 14-1).
Figure 14-113:
The alert architecture for Oracle
Let’s begin by visiting each script briefly to
see how it functions.
Daily STATSPACK Alert Report
The daily STATSPACK alert report is
indispensable for any Remote DBA. The script is generally run daily to tell
the Remote DBA about any exceptional conditions within the database. The
report has two script components:
·
statspack_alert.ksh This
is the Korn shell script to launch and e-mail the report. You can
customize the script to send e-mail alerts to anyone within your
organization.
·
statspack_alert.sql This
report uses the standard STATSPACK tables and reports on out-of-bounds
conditions. Note that the alert threshold percentages can be adjusted
depending on your needs. Most Remote DBAs schedule this report to run every
day, and to report on the past week. A special modified version of
this script, called statspack_alert80.sql, is available for
databases that are not on Oracle8i.
To appreciate the value of the report, let’s
examine the component output from the script.
Hot File Reads Alert
The script that follows reports on files whose
reads are greater than 25 percent (or 50 percent, or 75 percent) of
total reads. The code compares the individual I/O for a file from
stats$filestatxs with the overall I/O for the period in stats$sysstat.
When you find a “hot” file, you may want to place it in the KEEP pool
or to stripe it across multiple disks.
Hot File Writes Alert
The script that follows alerts you to files
whose write I/Os are greater than 25 percent (or 50 percent, or
75 percent) of total writes. This information can help the Remote DBA locate
files that are consuming more than a normal proportion of I/O writes.
You may want to place such files in the KEEP pool or to stripe them
across multiple disks.
***********************************************************
This will identify any single file with a write I/O
more than 50% of the total write I/O of the database.
The "hot" file should be examined, and the hot table/index
should be identified using STATSPACK.
- The busy file should be placed on a disk device with
"less busy" files to minimize write delay and channel
contention.
- If small file has a hot small table, place the table
in the KEEP pool
***********************************************************
Yr. Mo Dy Hr. FILE_NAME WRITES PCT_OF_TOT
---------------- ------------------------------ ----------
----------
2001-02-05 12 /u02/oradata/bookd01.dbf 1268
38
2001-02-05 13 /u02/oradata/bookd01.dbf 1371
52
2001-02-05 17 /u02/oradata/bookd01.dbf 1489
58
2001-02-05 20 /u02/oradata/bookd01.dbf 807
37
2001-02-05 23 /u02/oradata/bookd01.dbf 840
56
Data Buffer Hit Ratio Alert
The Data Buffer Hit Ratio report alerts the Remote DBA
to times when the data buffer hit ratio falls below the preset
threshold. It is very useful for locating times when decision support
type queries are being run, because a large number of large-table full
table scans will make the data buffer hit ratio drop. The script also
reports on all three data buffers, including the KEEP and RECYCLE
pools. It can be customized to report on individual pools, because the
KEEP pool should always have enough data blocks to cache all table
rows, and the RECYCLE pool should get a very low buffer hit ratio. If
the data buffer hit ratio is less than 90 percent, you may want to
increase db_block_buffers, buffer_pool_keep, or
buffer_pool_recycle. Also note that the Oracle 8.0 version of this
alert is available.
Disk Sorts Alert
If disk sorts are greater than 100/hr, you may
want to increase sort_area_size or tune SQL to perform index
scans. The Disk Sorts report is very useful for monitoring the amount
of activity against the TEMP tablespace, and it is also useful for
ensuring that sort_area_size is set to an optimal level. As a
general rule, increasing sort_area_size reduces the number of
disk sorts, but huge sorts will always need to be performed on disk in
the TEMP tablespace.
***********************************************************
When there are high disk sorts, you should investigate
increasing sort_area_size or adding indexes to force index_full
scans
***********************************************************
TO_CHAR(SNAP_ SORTS_MEMORY SORTS_DISK RATIO
------------- ------------ ---------- ---------------
2001-01-27 08 6,731 144 .0213935522211
2001-01-27 09 12,532 155 .0123683370571
2001-01-27 10 10,313 147 .0142538543586
2001-01-27 20 6,615 102 .0154195011338
2001-01-28 11 8,574 137 .0159785397714
2001-01-29 04 19,979 111 .0055558336253
2001-01-29 10 13,650 126 .0092307692308
2001-01-29 19 24,293 186 .0076565265714
2001-01-29 21 5,753 102 .0177298800626
2001-01-30 11 17,895 422 .0235820061470
2001-01-31 13 11,821 154 .0130276626343
2001-02-02 14 10,936 129 .0117959034382
2001-02-02 20 13,443 153 .0113813880830
2001-02-02 21 31,355 260 .0082921384149
2001-02-03 05 6,252 135 .0215930902111
2001-02-03 06 6,285 145 .0230708035004
2001-02-03 07 6,313 139 .0220180579756
I/O Wait Alert
The code that follows interrogates the
stats$filestatxs.wait_count column to report on any files with an
excessive amount of wait activity. If the number of I/O waits appears
excessive, the Remote DBA needs to investigate the cause of the waits. High
I/O waits on files are commonly associated with buffer busy waits, and
may be caused by tables with too few freelists.
***********************************************************
When there are high I/O waits, disk bottlenecks may exist.
Run iostats to find the hot disk and shuffle files to
remove the contention
***********************************************************
SNAPDATE FILENAME WAITS
---------------- ---------------------------------------- ----------
2001-01-28 23 /u03/oradata/PROD/applsysd01.dbf 2169
/u04/oradata/PROD/applsysx01.dbf 1722
/u03/oradata/PROD/rbs01.dbf 2016
2001-01-30 16 /u03/oradata/PROD/mrpd01.dbf 1402
2001-01-31 23 /u03/oradata/PROD/applsysd01.dbf 4319
/u04/oradata/PROD/applsysx01.dbf 3402
/u03/oradata/PROD/rbs01.dbf 3012
Buffer Busy Wait Alert
Whenever you see buffer busy waits, you have a
condition where a data block is in the data buffer but is unavailable.
This type of contention is usually for a segment header block of a
high-level index node block. Adding freelists for the object often
corrects such wait conditions.
***********************************************************
Buffer Busy Waits may signal a high update table with too
few freelists. Find the offending table and add more freelists.
***********************************************************
MYDATE NAME BUFFER_BUSY_WAIT
------------- -------------------- ----------------
2001-01-18 13 DEFAULT 33
2001-01-28 12 DEFAULT 120
2001-01-29 03 DEFAULT 14
2001-02-14 16 DEFAULT 825
2001-02-21 10 DEFAULT 332
2001-02-28 09 DEFAULT 44
Redo Log Space Requests Alert
If redo log space requests are greater than 0,
you may want to increase the log_buffer init.ora parameter. A
high number of redo log space requests indicates a high level of
update activity, and the Oracle log buffer is having trouble
keeping up with the volume of redo log images.
***********************************************************
High redo log space requests indicate a need to increase
the log_buffer parameter
***********************************************************
TO_CHAR(SNAP_ REDO_LOG_SPACE_REQUESTS
------------- -----------------------
2001-01-17 23 31
Chained Row Alert
When the table fetch continued row
parameter is greater than 10,000/hr, you may have row chaining because
PCTFREE is set too low. The table fetch continued row can also
be triggered by reading data blocks with long columns that exceed the
block size. This is common with rows that contain RAW, LONG RAW, NCLOB,
CLOB, or BLOB datatypes.
***********************************************************
Table fetch continued row indicates chained rows, or fetches of
long datatypes (long raw, blob)
Investigate increasing db_block_size or reorganizing tables
with chained rows.
***********************************************************
TO_CHAR(SNAP_ TABLE_FETCH_CONTINUED_ROW
------------- -------------------------
2001-01-27 20 29,523
2001-01-27 22 45,338
2001-01-28 00 45,224
2001-01-28 14 44,522
2001-01-29 13 73,350
2001-01-29 14 62,689
2001-01-30 00 41,660
2001-02-01 09 16,308
2001-02-01 14 48,415
2001-02-01 15 56,480
2001-02-01 16 77,914
2001-02-01 17 66,382
2001-02-01 18 59,813
2001-02-01 19 57,564
2001-02-03 04 21,229
Shared Pool Contention Alert
Enqueue deadlocks can indicate contention
within the shared pool and locking related problems. Enqueue deadlocks
are associated with the “deadly embrace” condition, where one task is
locking resources and another task that is holding resources requests
a lock on the resources of the first task. To prevent such tasks from
waiting forever, Oracle aborts the tasks that requested the lock that
caused the deadly embrace.
***********************************************************
Enqueue Deadlocks indicate contention within the Oracle
shared pool.
Investigate increasing shared_pool_size
***********************************************************
TO_CHAR(SNAP_ ENQUEUE_DEADLOCKS
------------- -----------------
2001-01-28 23 23
2001-02-19 20 341
2001-02-29 01 47
2001-03-04 14 963
2001-03-25 08 55
Full Table Scan Alert
Long-table full table scans are legitimate only
when the query requires access to more than 40 percent of ordered
table rows and more than 7 percent of unordered table rows. Excessive
large-table full table scans may indicate poorly tuned SQL that is not
using an index.
***********************************************************
Long-table full table scans can indicate a need to:
- Make the offending tables parallel query
(alter table xxx parallel degree yyy;)
- Place the table in the RECYCLE pool
- Build an index on the table to remove the FTS
To locate the table, run access.sql
See Oracle Magazine September 2000 issue for details
***********************************************************
TO_CHAR(SNAP_ FTS
------------- ------------
2001-01-29 08 17,467
2001-01-29 09 8,570
2001-01-29 10 9,008
2001-01-29 11 8,700
2001-01-29 12 3,157
2001-01-29 15 1,599
2001-01-30 15 2,666
2001-02-02 09 1,759
2001-02-02 11 1,423
2001-02-02 12 1,801
2001-02-02 15 2,417
2001-02-02 17 1,024
2001-02-02 18 1,171
Background Wait Alert
The query that follows interrogates the
stats$bg_event_summary table to find events with high waits. When
background events experience more than 100 timeouts/hr, you may have a
locking problem.
***********************************************************
Excessive waits on background events
***********************************************************
Yr Mo Dy Hr EVENT tot waits time wait
timeouts
------------- ------------------------- --------- ---------
--------
2001-01-29 19 latch free 143 0
142
System Waits Alert
The query that follows interrogates the
stats$system_event table to locate events where there are excessive
waits. If you experience waits on latch free or enqueue, LGWR waits,
or buffer busy waits, you need to locate the cause of the contention.
***********************************************************
Excessive event waits indicate shared pool contention
***********************************************************
Yr Mo Dy Hr EVENT WAITS AVG_WAIT_SECS
------------- ------------------------- --------
-------------
2001-01-31 11 SQL*Net message from client 95,687
1
2001-01-31 11 SQL*Net more data from client 776
0
2001-01-31 11 enqueue 997
0
2001-01-31 11 latch free 2,399
0
2001-01-31 12 SQL*Net message from client 99,974
1
2001-01-31 12 SQL*Net more data from client 992
0
2001-01-31 12 enqueue 1,071
0
2001-01-31 12 latch free 2,649
0
Library Cache Misses Alert
The query that follows interrogates the
stats$librarycache table to look for excessive library cache miss
ratios. When the library cache miss ratio is greater than 0.02, you
may want to increase shared_pool_size.
***********************************************************
Excessive library cache miss ratio
***********************************************************
Cache Misses
MYDATE execs While Executing LIBRARY_CACHE_MISS_RATIO
------------- ---------- --------------- ------------------------
2001-01-18 13 2,835 2,122 .02027262
2001-01-24 02 3,381 1,292 .08028927
2001-01-28 15 3,572 2,002 .04111982
2001-01-29 09 121 65 .07073563
Database Writer Contention alert
The query that follows looks at the
stats$sysstat table for values in summed dirty queue length, write
requests, and DBWR checkpoints. When the write request length is
greater than 3, or you have excessive DBWR checkpoint waits, you need
to look at tuning the database write processes.
***********************************************************
Excessive length of DBWR processes
***********************************************************
Yr Mo Dy Hr Write request length Write Requests DBWR
checkpoints
------------- -------------------- ------------ ------------
2001-01-31 11 44,201 95,687 12
2001-02-12 14 2,846 3,028
141
Data Dictionary Miss Ratio Alert
The query that follows looks at the
stats$rowcache_summary table to compute data dictionary gets, data
dictionary cache misses, and the data dictionary hit ratio. The script
alerts the Remote DBA to times when requests for data dictionary metadata are
high. The problem can sometimes be relieved by increasing the
shared_pool_size init.ora parameter.
***********************************************************
Data Dictionary Miss Ratio below 90% indicates the need
to increase the shared_pool_size
***********************************************************
Data
Data
Data Dictionary
Dictionary
Dictionary Cache Hit
Yr Mo Dy Hr Gets Misses
Ratio
------------- --------- ------------
----------
2001-01-28 05 516,495 55,923 89
2001-01-31 07 753,417 81,438
89
2001-02-03 02 437,913 45,087
89
Data Dictionary Object Alert
The Data Dictionary Object report looks into
the stats$rowcache_summary table to find individual parameters that
experience a low hit ratio. The report can reveal internal contention
with the Oracle data dictionary and times of high dictionary metadata
requests.
***********************************************************
Report when Data Dictionary Hit Ratio for an object
falls below 70%
***********************************************************
Data Dictionary Data
Object
Dictionary Cache
Dictionary Hit
Yr Mo Dy Hr PARAMETER Gets Misses Usage
Ratio
---------------- --------------- ----------- ----------- ----------
------
2001-01-28 05 dc_free_extents 954 318
313 67
2001-02-01 03 dc_used_extents 363 338
330 7
2001-02-03 02 dc_free_extents 638 318
314 50
Now that we have covered the STATSPACK alert
report, let’s look at an equally important daily report on the
database servers and web servers.
This is an excerpt from "Oracle9i
High Performance tuning with STATSPACK" by Oracle Press.
|
|
|
|
Get the Complete
Oracle Tuning Details
The landmark book
"Oracle
Tuning: The Definitive Reference Second Edition" has been
updated with over 1,500 pages of expert performance tuning tips.
It's packed with scripts and tools to hypercharge Oracle 11g
performance and you can
buy it for 40% off directly from the publisher.
|
|
|
|
|
 |
Expert Remote DBA
BC is America's oldest and largest Remote DBA Oracle support
provider. Get real Remote DBA experts, call
BC Remote DBA today. |
 |
|