Clustered systems have provided a high-availability, high-scalability solution since initially appearing in the 1980s in the DEC VAXcluster configuration. Clusters can combine all the components of separate machines, including CPUs, memory, and I/O subsystems, into a single hardware entity. However, clusters are typically built by using shared disks linked to multiple "nodes" (computer systems). An interconnect between systems provides a means of exchanging data and instructions without writing to disk (see Figure 11-3). Each system runs its own copy of an operating system and Oracle instance. Grids, described later in this chapter, are typically made up of a few very large clusters.
Oracle's support for clusters dates back to the VAXcluster. Oracle provided a sophisticated locking model so that the multiple nodes could access the shared data on the disks. Clusters required such a locking model, because each machine in the cluster had to be aware of the data locks held by other, physically separate machines in the cluster.
Today, that Oracle solution has evolved into Real Application Clusters or RAC (replacing the Oracle Parallel Server (OPS) that was available prior to Oracle9i). RAC is most frequently used for Windows, Linux, or Unix-based clusters. Oracle provides an integrated lock manager that mediates between different servers, or nodes, that seek to update data in the same block.
Real Application Clusters introduced full support of Cache Fusion; with Cache Fusion, locks are maintained in memory without frequent writing to disk. Cache Fusion is different from the standard locking mechanisms that are described in Chapter 7, in that it applies to blocks of data, rather than rows. The mediation is necessary because two different nodes might try to access different rows in the same physical block, which is the smallest amount of data that can be used by Oracle.
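To see why block-level mediation matters even when nodes touch different rows, consider the following minimal sketch. It is not Oracle's implementation (Cache Fusion also ships blocks between nodes over the interconnect, which is omitted here); `ROWS_PER_BLOCK`, `BlockLockManager`, and the node names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field

ROWS_PER_BLOCK = 4  # hypothetical; real capacity depends on block and row size


def block_of(row_id: int) -> int:
    """Map a row number to the block that physically holds it."""
    return row_id // ROWS_PER_BLOCK


@dataclass
class BlockLockManager:
    """Toy in-memory lock table: block number -> node currently holding it."""
    holders: dict = field(default_factory=dict)

    def request(self, node: str, row_id: int) -> bool:
        """Grant the block containing row_id unless another node holds it."""
        block = block_of(row_id)
        holder = self.holders.get(block)
        if holder is None or holder == node:
            self.holders[block] = node
            return True
        return False  # another node holds the block, so mediation is needed

    def release(self, node: str, row_id: int) -> None:
        """Release the block containing row_id if node is the holder."""
        block = block_of(row_id)
        if self.holders.get(block) == node:
            del self.holders[block]


manager = BlockLockManager()
results = [
    manager.request("node_a", 0),  # row 0 is in block 0: granted
    manager.request("node_b", 1),  # row 1 is also in block 0: conflict
    manager.request("node_b", 5),  # row 5 is in block 1: granted
]
manager.release("node_a", 0)
results.append(manager.request("node_b", 1))  # block 0 is free now: granted
print(results)  # [True, False, True, True]
```

The second request fails even though the two nodes want different rows, because both rows live in the same block. The lock table is held entirely in memory, which mirrors the point that locks are maintained without frequent writes to disk.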
Cache Fusion greatly increased performance for read/write operations in Oracle8i OPS, and Oracle9i RAC extended Cache Fusion to cover write/write operations. Oracle Database 10g further improves performance over InfiniBand networks by supporting SDP (Sockets Direct Protocol) and asynchronous I/O protocols, which are lighter-weight transports than those used in earlier, traditional TCP/IP-based RAC implementations.
Prior to Real Application Clusters, you could configure clusters to deliver either higher throughput or greater availability for the system. In the high-availability scenario, if a single node fails, a secondary node attached to the shared disk can get access to the same data. Through client failover, which first appeared in Oracle8, queries can run to completion without further intervention. Real Application Clusters, due to its increased performance, can deliver both availability and scalability, as each node in a cluster acts as a failover for all the other nodes in the cluster.
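The high-availability scenario can be sketched as a small simulation. This is an illustration of the idea only, not Oracle's client failover mechanism; the `Node` class, the `query_with_failover` function, and the sample data are all hypothetical.

```python
class NodeDown(Exception):
    """Raised when a simulated node cannot serve a request."""


class Node:
    def __init__(self, name, shared_disk, up=True):
        self.name = name
        self.shared_disk = shared_disk  # every node sees the same storage
        self.up = up

    def query(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.shared_disk[key]


def query_with_failover(nodes, key):
    """Try each node in order; return (node name, value) from the first that answers."""
    for node in nodes:
        try:
            return node.name, node.query(key)
        except NodeDown:
            continue  # fall through to the next node in the cluster
    raise NodeDown("no surviving node")


shared_disk = {"order_42": "shipped"}
primary = Node("node_1", shared_disk)
secondary = Node("node_2", shared_disk)

before = query_with_failover([primary, secondary], "order_42")
primary.up = False  # simulate failure of the primary node
after = query_with_failover([primary, secondary], "order_42")
print(before, after)  # ('node_1', 'shipped') ('node_2', 'shipped')
```

Because both nodes are attached to the same storage, the secondary returns the same data once the primary is unavailable, and the caller never has to name a specific node.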
Real Application Clusters are being used more frequently in commodity server environments, where clusters of low-priced servers are used to deliver the performance of more expensive SMP machines. For simple failover on entry platforms, Oracle also bundles Fail Safe. Data is not shared by the two systems in a Fail Safe configuration; instead, a second system provides standby access to this data. Because concurrent access isn't provided, the Fail Safe solution doesn't offer the scalability that Real Application Clusters can provide. The use of clusters for high availability (both with and without Real Application Clusters) is discussed in Chapter 10.