This section provides only a very brief overview of standard backup and recovery options. For more detailed information about backup and recovery options, refer to Chapter 10.
Even if you've taken adequate precautions, critical database records can sometimes be destroyed as a result of user error or hardware or software failure. The only way to prepare for this type of potentially disastrous situation is to perform regular backup operations.
Two basic types of potential failures can affect an Oracle database: instance failure, in which the Oracle instance terminates without going through the shutdown process; and media failure, in which the disks that store the information in an Oracle database are corrupted or damaged.
After an instance failure, Oracle will automatically perform crash recovery; you can use Real Application Clusters (Oracle Parallel Server prior to Oracle9i) to automatically perform instance recovery when one of its instances crashes. However, DBAs must initiate recovery from media failure. The ability to recover successfully from this type of failure is one of the greatest challenges a DBA faces; it's also the place where the value of the DBA becomes most apparent! The recovery process includes restoring older copies of the damaged datafile(s) and rolling forward by applying archived and online redo logs.
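The restore-and-roll-forward sequence can be pictured with a toy model. The sketch below is not Oracle code: it assumes a datafile is a simple map from block number to contents and that each redo record carries an SCN, a block number, and the new contents. The names RedoRecord and roll_forward are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoRecord:
    scn: int     # system change number of the change
    block: int   # block the change applies to
    value: str   # new contents of the block


def roll_forward(restored, backup_scn, archived, online):
    """Apply archived and online redo newer than the backup to a restored copy."""
    datafile = dict(restored)  # work on a copy of the restored backup
    for rec in sorted(archived + online, key=lambda r: r.scn):
        if rec.scn > backup_scn:  # changes already in the backup are skipped
            datafile[rec.block] = rec.value
    return datafile


# Backup taken at SCN 100; later changes exist only in the redo logs.
backup = {1: "a", 2: "b"}
archived = [RedoRecord(90, 1, "a"), RedoRecord(110, 2, "b2")]
online = [RedoRecord(120, 3, "c")]
recovered = roll_forward(backup, 100, archived, online)
```

The point of the model is that the backup alone is stale; the redo stream is what brings the restored file up to the moment of failure.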
To ensure successful recovery, the DBA should have prepared for this eventuality by performing the following actions:
Multiplexing online redo logs by having multiple log members per group on different disks and controllers
Running the database in ARCHIVELOG mode so that redo log files are archived before they are reused
Archiving redo logs to multiple locations
Maintaining multiple copies of the control file(s)
Backing up physical datafiles frequently and, ideally, storing multiple copies in multiple locations
Running the database in ARCHIVELOG mode ensures that you can recover the database up to the time of the media failure; in this mode, the DBA can perform online datafile backups while the database is available for use. In addition, archived redo logs can be sent to a standby database (explained in Chapter 10) in which they may be applied.
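To see why ARCHIVELOG mode matters for recovery, consider a simplified simulation of redo log groups being reused in a circle. This is an illustrative sketch only: the function name run_log_switches is hypothetical, and the model archives a log at the moment its group is about to be reused, which is a simplification of how the database schedules archiving.

```python
def run_log_switches(num_groups, num_switches, archivelog):
    """Return (online, archived) log sequence numbers after a series of log switches."""
    online = [None] * num_groups  # sequence number currently held by each group
    archived = []                 # sequence numbers copied to the archive destination
    for seq in range(1, num_switches + 1):
        slot = (seq - 1) % num_groups  # groups are reused in a circle
        old = online[slot]
        if old is not None and archivelog:
            archived.append(old)       # archive before the group is overwritten
        online[slot] = seq
    return online, archived


# Three log groups, seven log switches, with and without archiving.
online, archived = run_log_switches(3, 7, archivelog=True)
available = sorted(archived + [s for s in online if s is not None])

online_na, archived_na = run_log_switches(3, 7, archivelog=False)
available_na = sorted(s for s in online_na if s is not None)
```

With archiving, every log sequence since the start remains available for roll-forward; without it, only the sequences still held in the online groups survive, so older changes cannot be reapplied.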
Recovery Manager (RMAN), first introduced in Oracle8 and greatly enhanced since, provides an easy-to-use frontend to manage this process. RMAN is accessible today through Enterprise Manager (EM) interfaces.
There are two major categories of backup:
Full backup
Includes backups of datafiles, datafile copies, tablespaces, control files (current or backup), or the entire database (including all datafiles and the current control file). Reads the entire file and copies all blocks into the backup set, skipping only datafile blocks that have never been used (with the exception of control files and redo logs where no blocks are skipped).
Incremental backup
Includes backups of datafiles, tablespaces, or the whole database. Reads the entire file and backs up only those data blocks that have changed since a previous backup.
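The difference between the two categories comes down to which blocks are copied. The following sketch models that selection under stated assumptions: each block records its contents and the SCN of its last change, a never-used block has no contents, and the function names are hypothetical rather than part of any Oracle interface.

```python
def full_backup(blocks):
    """Copy every block that has ever been used; never-used blocks are skipped."""
    return {n: b["data"] for n, b in blocks.items() if b["data"] is not None}


def incremental_backup(blocks, since_scn):
    """Copy only blocks changed after the SCN of a previous backup."""
    return {
        n: b["data"]
        for n, b in blocks.items()
        if b["data"] is not None and b["scn"] > since_scn
    }


datafile = {
    1: {"data": "emp rows", "scn": 50},
    2: {"data": "dept rows", "scn": 130},
    3: {"data": None, "scn": 0},  # never used
}
full = full_backup(datafile)
incr = incremental_backup(datafile, since_scn=100)
```

Both functions still visit every block, which mirrors the statement that the entire file is read in either case; only the set of blocks written to the backup differs.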
You can begin backups through the Recovery Manager (RMAN) or the Oracle Enterprise Manager interface to RMAN, use the database export facility, or initiate backups via standard operating system backup utilities.
RMAN was introduced with Oracle8 and replaced the Enterprise Backup Utility available for some previous Oracle7 releases. In general, RMAN supports the most database backup features, including open or online backups, closed database backups, incremental backups at the Oracle block level, corrupt block detection, automatic backups, backup catalogs, and backups to sequential media. RMAN added capabilities in Oracle9i for one-time backup configuration, recovery windows to determine and manage expiration dates of backups, and restartable backups and restores. Also added was support for testing of restores and recovery.
In Oracle Database 10g, RMAN can perform image copy backups of the database, tablespaces, or datafiles. RMAN can be used to apply incremental backups to datafile image backups. The speed of incremental backups is increased through a change tracking feature by reading and backing up only changed blocks.
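The two ideas in this paragraph, change tracking and applying incrementals to an image copy, can be sketched together. This is a conceptual model only: a Python set stands in for whatever structure the database actually uses to track changed blocks, and all function names are hypothetical.

```python
def record_change(tracked, block_no):
    """Change tracking: note each modified block as the change happens."""
    tracked.add(block_no)


def incremental_from_tracking(datafile, tracked):
    """Read only the tracked blocks instead of scanning the whole datafile."""
    return {n: datafile[n] for n in sorted(tracked)}


def apply_incremental(image_copy, incremental):
    """Merge an incremental backup into an image copy to bring it forward."""
    updated = dict(image_copy)
    updated.update(incremental)
    return updated


datafile = {1: "a", 2: "b", 3: "c"}
image_copy = dict(datafile)  # image copy taken at this point
tracked = set()

datafile[2] = "b2"           # a later change to block 2
record_change(tracked, 2)

incr = incremental_from_tracking(datafile, tracked)
rolled = apply_incremental(image_copy, incr)
```

Because the tracked set already names the changed blocks, the incremental step touches one block rather than three, and the merged image copy matches the current datafile without a new full copy being taken.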
Recovery options include the following:
Complete database recovery to the point of failure
Tablespace point-in-time recovery (recovery of a tablespace to a time different from the rest of the database)
Time-based or point-in-time database recovery (recovery of the entire database to a time before the most current time)
Recovery until the CANCEL command is issued
Change-based or log sequence recovery (to a specified System Change Number, or SCN)
You can recover through the use of RMAN (utilizing the recovery catalog or control file) or via SQL or SQL*Plus.
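Complete and incomplete recovery differ only in where the application of redo stops. The sketch below illustrates that choice for the SCN-based and time-based options; it assumes redo is a list of (SCN, timestamp) pairs in SCN order and that the stop point itself is excluded. The function name redo_to_apply is hypothetical.

```python
from datetime import datetime


def redo_to_apply(redo, until_scn=None, until_time=None):
    """Select redo records for complete or incomplete recovery.

    redo is a list of (scn, timestamp) tuples in SCN order. With no limit,
    everything is applied (complete recovery). Assumption of this sketch:
    a limit stops recovery just before the given SCN or time.
    """
    selected = []
    for scn, ts in redo:
        if until_scn is not None and scn >= until_scn:
            break
        if until_time is not None and ts >= until_time:
            break
        selected.append(scn)
    return selected


redo = [
    (100, datetime(2004, 1, 1, 9)),
    (110, datetime(2004, 1, 1, 10)),
    (120, datetime(2004, 1, 1, 11)),
]
complete = redo_to_apply(redo)
to_scn = redo_to_apply(redo, until_scn=120)
to_time = redo_to_apply(redo, until_time=datetime(2004, 1, 1, 10))
```

Cancel-based recovery follows the same pattern, except that the stop point is decided interactively while logs are being applied rather than supplied up front.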
RMAN in Oracle Database 10g improves the reliability of backups and restores through a number of added features. The current database supports the backup and restore of standby control files. RMAN now automatically retries a failed backup or restore operation. During recovery, RMAN automatically creates and recovers datafiles that have never been backed up. Where backups are missing or corrupt during the restore process, RMAN now automatically uses an older backup.
To speed backups and restoration, Oracle Database 10g introduces the Flash Recovery Area, organizing recovery files to a specific area on disk. These files include control files, archived log files, flashback database logs, datafile copies, and RMAN backups. You can set a retention policy to retain needed recovery files for specific time windows. Backup files and archivelogs that age beyond the time window are automatically deleted. Oracle Database 10g's ASM (described earlier in this chapter) can configure the Flash Recovery Area. If availability of disk space is an issue, RMAN in Oracle Database 10g also has the ability to compress backup sets.
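The time-window retention described above can be illustrated with a small calculation. This sketch assumes one rule: the newest backup taken at or before the start of the window must be kept, because it is the starting point for recovering to the earliest time in the window; anything older than that backup is no longer needed. The function name obsolete_backups and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta


def obsolete_backups(backup_times, now, window_days):
    """Return backups no longer needed to recover to any time inside the window."""
    window_start = now - timedelta(days=window_days)
    older = [t for t in backup_times if t <= window_start]
    if not older:
        return []        # nothing predates the window, so every backup is needed
    anchor = max(older)  # newest backup at or before the start of the window
    return sorted(t for t in backup_times if t < anchor)


backups = [
    datetime(2004, 1, 1),
    datetime(2004, 1, 5),
    datetime(2004, 1, 10),
    datetime(2004, 1, 14),
]
obsolete = obsolete_backups(backups, now=datetime(2004, 1, 15), window_days=7)
```

With a seven-day window ending on January 15, the January 5 backup is retained even though it lies outside the window, since recovery to January 8 has to start from it; only the January 1 backup becomes eligible for deletion.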
A number of Oracle Backup Solutions Program (BSP) partners certify their products to perform backup and recovery to disk and tape storage devices using RMAN. Oracle bundles Legato's Single Server Version (LSSV) tape storage management product at no charge on popular Unix platforms (such as AIX, HP-UX, Linux, Solaris, and Tru64 Unix) and Windows platforms. The LSSV's capabilities include the following:
Media management, including tape labeling, media tracking, and retention policy management
Ability to use up to four locally connected tape drives and up to four concurrent data streams
Installation integrated into the Oracle installer
Integration with EM for administration
There are some limitations when using LSSV. The product doesn't provide networked backups; it provides support for only a limited number of tape devices; and it doesn't support robotic libraries or backups of operating system files, network attached storage (NAS), or storage area networks (SAN). Legato offers optional products providing these additional backup capabilities.
There are other Oracle BSP partners, such as Computer Associates, EMC, Hewlett-Packard, Sun, IBM Tivoli, and Veritas. Current partners are generally posted at http://otn.oracle.com (the Oracle Technology Network). You may want to check with your favorite backup vendor about their current certification in support of Oracle's Media Management Interface Library (MML), the interface to RMAN.