This chapter focuses on the issues that must be
considered when designing for Real Application
Clusters (RAC). The reasons for using RAC
must be well understood before a proper
implementation can be achieved. These are the
key reasons to use RAC:
- Spread the CPU load across multiple servers
- Provide high availability (HA)
- Take advantage of larger SGA sizes than can
  be accommodated by a single instance
  commodity server
- Scalability
Conversely, there are cases where RAC may not be
an appropriate design option.
It would be wise for both technical and
non-technical team members to keep the following
in mind when considering RAC.
A high availability RAC design must have no
single point of failure, must provide transparent
application failover, and must be reliable.
Failure of the local data center must
also be considered.
A high availability design requires
attention to equipment, software, and the
network.
The following sections provide a look into two
key design considerations. The first is the
design of the equipment needed to support an HA
RAC configuration. Next, the methods of
configuring RAC instances in a RAC cluster to
meet performance and HA requirements will be
addressed.
Designing Equipment for Real Application Clusters
The most important design feature of the
equipment used in HA RAC clusters is an
architecture that eliminates any single point of
failure (SPF). The configuration in Figure 4.1
contains a number of design flaws and therefore
does not meet the definition of high
availability.
Figure 4.1: Non-Redundant Configuration
Figure 4.1 shows a RAC configuration. However,
this configuration, other than the RAC cluster
itself, has no redundancy and many single points
of failure. The single points of failure are:
- Firewall
- Application Server
- Fabric Switch
- SAN array
A failure of any one of these single points will
result in unscheduled downtime, no matter how
well the RAC cluster is designed and tuned.
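The list above can be checked mechanically. The following is a minimal sketch, not a description of any Oracle tool: it models the configuration as a directed graph and reports every component whose removal cuts users off from the data. The wiring and the node names (users, firewall, app_server, rac_node_1, rac_node_2, fabric_switch, san_array, data) are assumptions made for illustration, since the exact connections depend on the diagram.

```python
from collections import deque


def reachable(edges, start, goal, removed=None):
    """Breadth-first search from start to goal, skipping the removed component."""
    if removed in (start, goal):
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in edges.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(edges, start, goal):
    """Return the components whose loss alone disconnects start from goal."""
    components = set(edges) | {n for targets in edges.values() for n in targets}
    components -= {start, goal}
    return sorted(c for c in components
                  if not reachable(edges, start, goal, removed=c))


# Assumed wiring of a non-redundant configuration (hypothetical names).
non_redundant = {
    "users": ["firewall"],
    "firewall": ["app_server"],
    "app_server": ["rac_node_1", "rac_node_2"],
    "rac_node_1": ["fabric_switch"],
    "rac_node_2": ["fabric_switch"],
    "fabric_switch": ["san_array"],
    "san_array": ["data"],
}

spf = single_points_of_failure(non_redundant, "users", "data")
print(spf)  # ['app_server', 'fabric_switch', 'firewall', 'san_array']
```

Under this assumed wiring, neither RAC node appears in the result because each one covers for the other. Every component outside the cluster does appear.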
It is critical to ensure that there is no single
point of failure in a high availability
configuration. Figure 4.2 illustrates exactly
what eliminating single points of failure means.
Figure 4.2: Example of a Redundant RAC Configuration
The system shown in Figure 4.2 has had the
following redundancies added:
- Second firewall with an independent
  connection to the web
- Second application server
- Second fabric switch with redundant pathways
- Second SAN array
- Set of load balancers
- Geo-remote RAC Guard configuration
With these additions, the single points of
failure in Figure 4.1 have been eliminated. A
third server and a SAN array have also been
added in a geographically remote location. This
third server and SAN ensure that even a disaster
at the primary location will not bring the
application down.
The application server and firewall for
this third server are not shown and may not be
required if the firewalls and application
servers are in a different location from the
database servers.
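The effect of duplicating each component can be estimated with simple availability arithmetic. The sketch below is only illustrative: the 99% per-component availability is a hypothetical figure, failures are assumed to be independent, and the load balancers and the remote site are left out of the model. Components that must all work are multiplied together (series), while a redundant set fails only when every copy fails (parallel).

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def parallel(availability, copies):
    """Availability of redundant copies, assuming independent failures."""
    return 1 - (1 - availability) ** copies


def series(*availabilities):
    """Availability of a chain in which every element must be up."""
    return prod(availabilities)


def downtime_hours(availability):
    """Expected unscheduled downtime per year, in hours."""
    return (1 - availability) * HOURS_PER_YEAR


a = 0.99  # hypothetical availability of each individual component

# Non-redundant path: firewall, application server, fabric switch and
# SAN array are single components; only the RAC nodes are paired.
single_path = series(a, a, a, a, parallel(a, 2))

# Redundant path: all five tiers are paired.
redundant_path = series(*(parallel(a, 2) for _ in range(5)))

print(f"non-redundant: {single_path:.4f}, "
      f"{downtime_hours(single_path):.0f} h/year down")
print(f"redundant:     {redundant_path:.4f}, "
      f"{downtime_hours(redundant_path):.1f} h/year down")
```

With these assumed numbers the non-redundant path works out to roughly 346 hours of expected downtime per year, against about 4.4 hours for the fully paired path. Real figures depend on measured component reliability and on failures that are not independent, such as a shared power feed.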
In addition, the SAN, perhaps a
Hitachi, EMC Clariion or
EMC Symmetrix, should be configured with
redundant disk layouts such as RAID-1
or RAID-5.
It should be stressed that application
performance can suffer badly after a disk
failure, either while the array rebuilds onto an
installed spare or while it reconstructs the
missing data from the parity information held on
the other disks in a RAID-5 set.
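The cost of a parity rebuild is easier to see with a small example. The following sketch assumes the usual XOR parity scheme and a single stripe with a dedicated parity block. Real RAID-5 arrays rotate parity across the disks and work on much larger blocks, and the block contents here are made up. It shows that recovering one lost block requires reading every surviving block in the stripe, which is why rebuilds compete so heavily with application I/O.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe spread over four hypothetical data disks, plus its parity block.
data_blocks = [b"ORCL", b"RAC1", b"RAC2", b"REDO"]
parity = xor_blocks(data_blocks)

# Simulate the loss of the disk holding the third block.
failed = 2
survivors = [blk for i, blk in enumerate(data_blocks) if i != failed]

# Rebuild: XOR of all surviving data blocks and the parity block.
rebuilt = xor_blocks(survivors + [parity])
blocks_read = len(survivors) + 1

print(rebuilt)      # b'RAC2'
print(blocks_read)  # 4 blocks read to recover 1
```

The same reads are needed for every stripe on the failed disk, and until the rebuild finishes, any request for data on that disk has to be answered by the same reconstruction.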