Using the LTOM utility
Oracle introduced LTOM, the Lite Onboard Monitor
(written by Carl Davis of the Oracle Center of
Expertise), as a proactive performance monitoring
tool for the senior Oracle DBA.
LTOM is free and can be downloaded from
MetaLink.
LTOM joins the list of supplemental monitors
that provide external server-side information
about disk, RAM, network, and CPU influences on Oracle
performance.
Oracle LTOM is unlike the reactive Oracle tuning
tools that alert the DBA only after the database
has already experienced a slowdown. Rather, LTOM
is a proactive tool, collecting real-time data,
as well as data from vmstat, and enabling a
detailed trace mechanism. LTOM provides
real-time automatic problem detection and data
collection.
WARNING - LTOM is not for beginners.
MetaLink Note 352363.1 describes LTOM as an
"Embedded Real-Time Data Collection and
Diagnostics Platform" and explicitly notes
that LTOM is only for use by experienced
Oracle database administrators.
What is LTOM?
Oracle LTOM (Lite Onboard Monitor) is described
as an OS-independent tool (with a Java front end)
that triggers detailed trace collection
whenever an LTOM user-defined threshold event
(non-idle wait event and/or CPU usage) occurs.
The Lite Onboard Monitor is a Java program
designed as a real-time diagnostic platform for
deployment to a customer site.
LTOM runs on the customer's
UNIX server,
is tightly integrated with the host operating
system and provides an integrated solution for
detecting and collecting trace files for system
performance issues.
The ability of LTOM to detect problems and
collect data in real-time will hopefully reduce
the amount of time it takes to solve problems
and reduce customer downtime.
LTOM Features
The new Oracle LTOM tool has the following
features, centered around the concept of
threshold-based data recording (i.e., trace files).
LTOM creates no footprint on the database. All
data is written to ASCII text files: either
Oracle session trace files located in the udump
directory or a specific log file associated with
the service being used (manual recorder, auto
recorder, hang detection, or session recorder).
The manual recorder writes vmstat, mpstat, and
top command output to an ASCII log file.
The session recorder uses an in-memory trace
buffer for the 10046 trace. Sessions are traced
in-memory until they violate either a CPU or
wait event rule
and, at that time, the contents of the
memory buffer are dumped to disk.
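The buffer-then-dump behavior described above can be sketched in a few lines of Python. This is only an illustration, not LTOM's actual implementation: the class name, the fixed buffer capacity, and the threshold parameters are all hypothetical, and the sketch assumes the buffer keeps only the most recent trace lines.

```python
from collections import deque


class SessionTraceBuffer:
    """Hypothetical in-memory trace buffer (not LTOM's real code)."""

    def __init__(self, capacity, cpu_limit_pct, wait_limit_ms):
        # Assumption: a bounded buffer; the oldest lines fall off when full.
        self.lines = deque(maxlen=capacity)
        self.cpu_limit_pct = cpu_limit_pct
        self.wait_limit_ms = wait_limit_ms

    def record(self, line, cpu_pct, wait_ms):
        """Buffer one trace line; return the buffer contents on a rule violation."""
        self.lines.append(line)
        if cpu_pct > self.cpu_limit_pct or wait_ms > self.wait_limit_ms:
            dumped = list(self.lines)
            self.lines.clear()
            return dumped  # a real tool would write these lines to disk
        return None
```

The point of the design is that tracing costs only memory while a session behaves normally, and disk I/O occurs only when a CPU or wait event rule is violated.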
LTOM Wait Event Rules
LTOM implements a rule-based approach to allow
the DBA to specify collection-triggering
threshold rules based on the scalar values for
Oracle non-idle wait events.
LTOM External Data Recording
LTOM addresses a major shortcoming of Statspack:
its inability to gather data about the
external server environment, such as disk
queues, CPU run queues, and RAM paging.
One of the problems with relying solely on
Statspack is the inability to look at
performance from a holistic point of view.
Information about non-Oracle processes and the
health of the operating system in terms of
memory, CPU and I/O, for example, is not
collected. LTOM also addresses the difficulty of
deriving fine-grained detail from hourly
Statspack snapshots when more frequent
elapsed-time metrics are needed.
Further, all static data collectors are
problematic in that single-sample snapshots, or
multiple snapshots taken at 15- or 30-minute
intervals, can miss problems that occur only
briefly within a snapshot interval, because they
are averaged out over the duration of the snapshot.
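A small worked example shows how averaging hides a short spike. The figures are invented for illustration: a 30-minute snapshot interval sampled once per second, with CPU at 20% except for a 60-second burst at 100%.

```python
# Hypothetical per-second CPU readings over one 30-minute snapshot interval.
baseline = [20.0] * (30 * 60 - 60)   # 1740 seconds at 20% CPU
spike = [100.0] * 60                 # a 60-second burst at 100% CPU
samples = baseline + spike

interval_average = sum(samples) / len(samples)
peak = max(samples)

print(f"{interval_average:.1f}% average, {peak:.0f}% peak")
# prints "22.7% average, 100% peak"
```

A snapshot-based report would show roughly 23% CPU for the interval, giving no hint that the server was saturated for a full minute.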
The LTOM in-RAM data repository includes
data from both the UNIX/Linux top and vmstat
commands.
Note that many Oracle professionals have
implemented external scripts to capture
UNIX/Linux vmstat information.
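As an illustration of such an external script, here is a minimal Python sketch that runs vmstat and maps its column names to the latest sample. It assumes the Linux vmstat layout, in which the header row begins with the "r" and "b" columns and every value is an integer. Other UNIX variants print different columns, and the function names are hypothetical.

```python
import subprocess


def parse_vmstat(output):
    """Map the column names in vmstat's header row to the last sample's values."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    # Assumption: the header row is the one that starts with the columns "r" and "b".
    header = next(row for row in rows if row[:2] == ["r", "b"])
    last_sample = rows[-1]
    return dict(zip(header, (int(value) for value in last_sample)))


def sample_vmstat():
    """Run 'vmstat 1 2' and parse the second sample.

    The first row vmstat prints is an average since boot, so the second
    row is the one that reflects current activity.
    """
    result = subprocess.run(
        ["vmstat", "1", "2"], capture_output=True, text=True, check=True
    )
    return parse_vmstat(result.stdout)
```

Calling sample_vmstat() on a schedule and appending the result to a log file gives a simple history of run queue length ("r"), paging ("si", "so"), and CPU usage ("us", "sy", "id") alongside the database statistics.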
LTOM Automatic Data Recording
LTOM has a rule definition component called
automatic data recording that allows setting of
thresholds by providing specific values for
non-idle wait events.
When the LTOM thresholds are triggered,
data collection is enabled.
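The threshold idea can be sketched as follows. This is not LTOM's rule syntax, which is not shown here: the WaitRule class, its fields, and the threshold values are hypothetical, and only the wait event names are real Oracle events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaitRule:
    """Hypothetical rule: trigger when a wait event exceeds a threshold."""
    event: str            # non-idle wait event name
    threshold_ms: float   # observed wait time that triggers collection


def triggered_rules(rules, observed_waits_ms):
    """Return the rules whose event exceeded its threshold in this sample."""
    return [
        rule for rule in rules
        if observed_waits_ms.get(rule.event, 0.0) > rule.threshold_ms
    ]


rules = [
    WaitRule("db file sequential read", 20.0),
    WaitRule("latch free", 5.0),
]
sample = {"db file sequential read": 35.0, "latch free": 1.0}

# Data collection would be enabled for every rule returned here.
hits = triggered_rules(rules, sample)
print([rule.event for rule in hits])  # prints "['db file sequential read']"
```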
LTOM allows the definition of rules for external
CPU thresholds.
This is important because many 64-bit
databases with large RAM regions become CPU-bound:
once most data is cached in memory, disk I/O waits
shrink and CPU becomes the limiting resource.
This CPU tracing (recording the amount of CPU
used) is also important if the SQL optimizer
(CBO) has been changed to consider CPU costs by
setting the hidden parameter
_optimizer_cost_model=cpu
LTOM Automatic Session Tracing
LTOM can identify the session ID behind an
offending SQL statement and start a 10046 SQL
trace for it.
LTOM uses Oracle's extended SQL trace
facility, turning on a 10046 (super-detailed)
trace for the target session.
Automatic Session Tracing uses a set of rules to
determine when to turn on SQL trace for
individual Oracle sessions, using an event 10046
level 12 trace (which records both bind variable
values and wait events).
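For reference, level 12 is the sum of two 10046 flags: 4 (bind variable values) and 8 (wait events). The Python helper below builds the standard ALTER SESSION statements that switch this trace on and off for the current session. The helper names are hypothetical, and how LTOM itself enables the trace in another session is not shown here.

```python
BINDS = 4   # 10046 flag: include bind variable values
WAITS = 8   # 10046 flag: include wait events


def trace_on_sql(level=BINDS | WAITS):
    """Build the statement that enables event 10046 at the given level."""
    return (
        "ALTER SESSION SET EVENTS "
        f"'10046 trace name context forever, level {level}'"
    )


def trace_off_sql():
    """Build the statement that disables event 10046."""
    return "ALTER SESSION SET EVENTS '10046 trace name context off'"


print(trace_on_sql())
print(trace_off_sql())
```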
In sum, LTOM is an exciting, proactive Oracle
utility that overcomes many of the problems with
existing reactive database monitors.
In addition to system performance, the DBA must
be aware of what is happening to the datafiles
on the Oracle server. The DBA needs to know how
large the files are, how much space is available,
and whether the files have been corrupted.
File analysis
Server integrity is as important as server
performance.
Not only must the DBA have a good backup
and recovery strategy, which will be covered
later in this book, but the DBA must also have
the ability to detect and repair file corruption
when it occurs.
For more details on Oracle utilities, see the book "Advanced
Oracle Utilities" by Bert Scalzo, Donald K. Burleson, and Steve Callan.