Backup is a task that you will perform frequently. Restore should be a task you perform less frequently. The backup recommendations here and those from other environments might look similar. In the event of a disaster, recovering the OVO server to full operation involves the following three areas:
The operating system and user environment.
The application programs.
The network configuration.
The recovery concept seems straightforward at first, but look closer at what could be involved. If there is a file system failure, are you prepared for the time it will take to restore the system? Depending on the type of backup that you have performed, recovery could take hours or days. Operating system or application failures are often resolved with a patch. If the application fails after an upgrade, you could also decide to return to a prior release of the program until you work out the bugs in the test environment. What test environment? What does a test environment have to do with a backup? We will talk more about this in the next section of this chapter. The time to recovery also depends on the backup strategy you have adopted.
The procedure for backup and recovery should include a regular full-system backup. A full-system backup usually transfers the data to an offline storage medium such as tape, CD, or removable disk for off-site storage. An outline of a backup procedure might include the following steps:
In /etc/motd (the message of the day file), display the times the system will be down for maintenance, including scheduled backup times.
A few minutes before bringing the system down or starting the online backup, broadcast a message to all users that the system will be coming down for maintenance in a few minutes. Use the /etc/wall command or the opcwall command and give users time to finish what they are doing before they are logged off during system or process shutdown.
Offline "cold" backups should always be done in single-user mode to prevent users from accessing and using system files during the backup procedure. When bringing the system down, use the shutdown command with a grace period option.
After the system is in single-user mode, execute a sync command and check file system integrity with the fsck command. Take corrective action for any file system errors.
Before starting the backup, mount the specified file systems.
During the backup procedure, log the fsck activity, disk usage data, and commands executed during the backup to a log file and maintain a hardcopy.
A tape scan file from the tape utility can be generated in /tmp and printed for audit purposes.
Keep an adequate supply of tapes in the tape pool. Rotate the tapes on a regular basis and do not use tapes beyond their expected service life.
After the backup is complete, return the system to multi-user mode.
Consider storing the tapes at an off-site location, a precaution that protects the backups in case of catastrophic events like fires, hurricanes, tornadoes, and floods.
Test the recovery procedure to ensure that the tapes will restore the system information.
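The outline above can be sketched as a two-phase shell script. This is a minimal dry-run sketch, not a definitive procedure: the device names, log file, grace period, and run level are hypothetical placeholders, and the shutdown syntax assumes an HP-UX style command. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Dry-run sketch of the offline ("cold") backup outline.
# Device names, paths, and the run level are illustrative assumptions.
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands only
GRACE_SECONDS="${GRACE_SECONDS:-300}"        # hypothetical grace period
FS_DEVICES="${FS_DEVICES:-/dev/vg00/lvol5}"  # hypothetical devices to check
TAPE_DEVICE="${TAPE_DEVICE:-/dev/rmt/0m}"    # hypothetical tape drive
# Hypothetical log file; it must be writable in single-user mode.
BACKUP_LOG="${BACKUP_LOG:-/tmp/cold_backup.log}"

# Print the command in dry-run mode; otherwise run it and log its output.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        echo "running: $*" >> "$BACKUP_LOG"
        "$@" >> "$BACKUP_LOG" 2>&1
    fi
}

# Phase 1: executed in multi-user mode, shortly before the backup window.
pre_shutdown_steps() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would broadcast: system going down for backup"
    else
        echo "System going down for backup in $GRACE_SECONDS seconds" | wall
    fi
    # HP-UX style syntax assumed: -y answers prompts, then the grace period.
    run shutdown -y "$GRACE_SECONDS"
}

# Phase 2: executed from the console once the system is in single-user mode.
single_user_steps() {
    run sync
    for dev in $FS_DEVICES; do
        run fsck "$dev" || return 1    # stop so errors can be corrected first
    done
    run mount -a                       # mount the file systems to back up
    run df -k                          # record disk usage in the log
    run fbackup -f "$TAPE_DEVICE" -i /
    run init 3                         # assumed multi-user run level
}
```

The split into two functions reflects the fact that a script started in multi-user mode does not survive the transition to single-user mode. Calling pre_shutdown_steps and then, from the console, single_user_steps with DRY_RUN=1 prints the planned command sequence for review before anything is executed.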
Tools from multiple vendors support these backup methods. After you back up the OpenView data using the tools discussed in the next sections, it is still necessary to perform a standard backup of the operating system environment. The backup information provided in this section consists of general guidelines and is not OV-specific. Consider evaluating commercial off-the-shelf backup solutions, such as HP's DataProtector.
Many system administrators use cron or some other type of program to schedule regular system backups. The OpenView schedule template shown in Figure 17-1 is another method for executing a program on a regular schedule (except offline backups). The interface for the template is located within the Message Source Templates Window (
Window
Let us look at some of the tools to perform an OpenView backup.
The backup tools opc_backup and opc_recover can be executed from the command line. opc_backup can also be scheduled as part of the routine tasks for daily maintenance. The command line options provide for a full backup of the configuration data stored on disk and the runtime data stored in the Oracle database. Because some of the data is memory resident during operations, opc_backup requires the shutdown of both the server processes (including OVO, Oracle, and NNM) and the GUI processes. If the required downtime is not feasible within your OV operation, consider using the online backup method described in the next section of this chapter.
A brief summary of the opc_backup and opc_recover commands is provided here for reference purposes. Refer to the online manual page (man opc_backup) for further details.
SYNOPSIS
    opc_backup [-c][-d <dir>...][-h][-n][-v][<backup device>]
    opc_recover

DESCRIPTION
    The command opc_backup saves the HP OpenView Operations (OVO) environment. By default, opc_backup is used interactively but it can also be used non-interactively. The user can choose between two backup methods:

    1. A full OVO backup. This saves the OVO and OpenView Installation, consisting of the whole OpenView directory tree and all data contained in the Oracle database openview. Note that parts which are not located in the /opt/OV or /var/opt/OV or /etc/opt/OV sub trees are not backed up. This means that the Oracle binaries are not backed up.

    2. An OVO configuration backup only. This saves the entire ITO configuration, consisting of various ITO and OpenView configuration directories and all the data contained in the Oracle database openview.

    Note that a backup of the OpenView database includes both the currently active and the history messages. If you do not want to back up these messages to save space, acknowledge the active messages and download the history messages using the appropriate ITO administrator GUI functionality or the opchistdwn (1M) command before starting the backup.

    Before running opc_backup, make sure that no ITO user interface or any other ITO processes are accessing the database.

    The backup is written with the fbackup (1M) command. Symbolic links are NOT resolved, but saved as symbolic links.

    The command opc_backup performs the following steps:

    * If called in interactive mode, the user is asked for the backup method (full backup or configuration backup).
    * Checks for running ITO GUI processes.
    * If a full backup is applied and the management server acts as a managed node, the ITO agent processes are stopped. If the management server is in a clustered environment, you must make sure that no ITO agent process is running on any cluster client.
    * Stops the OpenView platform services, the ITO server processes and any other OpenView integrated products, by using the ovstop (1m) command. Make sure that no processes access the database either directly or over SQL*Net or Net8, from any other system.
    * Starts the SQL*Net or Net8 listener and the Oracle database, if they are not running.
    * Extracts all Oracle files needed for a complete backup of the database. This includes the data files, redo log files and control files.
    * Performs a shutdown of the Oracle instance.
    * If called in interactive mode, the user is asked for additional directories, which should also be backed up. The user can also choose to back up other directories in addition to the ITO configuration and database directories (for example, the directory with the downloaded history messages). Note that symbolic links are not resolved, but are saved as symbolic links.
    * Calculates and prints the approximate amount of free space needed for the backup on the backup device.
    * If called in interactive mode, the user is asked for the destination to which the backup will be written. You can specify the device file of a tape drive or a file on disk. A device file is assumed, if the backup destination starts with /dev/.
    * Writes the backup to the given destination, using the fbackup (1M) command.
    * Restarts the Oracle instance.
    * Restarts the OpenView platform services, the ITO server processes and any integrated products, by using ovstart (1m).
    * If a full backup is applied and the management server system is a managed node, the ITO agent processes are restarted.

    With opc_recover, you can restore any ITO configuration, which has been previously backed up using opc_backup. The recovery restores the saved ITO and OpenView directories and the openview database.

    The opc_recover script performs the following steps:

    * Asks the user whether the backup was applied with the full backup method or with the configuration backup method of opc_backup. (If a full backup was applied, the agent processes must be stopped and various directories must be cleared.)
    * If a full backup is applied and the management server acts as a managed node, all ITO agent processes are stopped. If the management server is in a clustered environment, you must make sure that no agent process is running on any cluster client.
    * Stops the OpenView platform services, the ITO server processes and any other OpenView integrated products, by using the ovstop (1m) command. Make sure that no processes access the database either directly or over SQL*Net or Net8, from any other system.
    * Shuts down the Oracle instance and stops the SQL*Net or Net8 listener.
    * Asks the user for the source from which the backup must be restored. It is possible to restore a backup from a tape or from a disk file. If the backup is restored from tape, the user is asked for the device file of the tape drive. If the backup is restored from disk, the user is asked for the path name of the file from which the backup must be restored.
    * Asks the user if the backup should be run in verbose mode. If verbose mode is specified, the restore command, fbackup (1M), is called with the verbose option and displays all restored files.
    * Clears the saved OpenView and ITO directories. This is to prevent inconsistencies between the information in the database and the information in the OpenView directories. If you restore from a full ITO backup, the whole /opt/OV, /var/opt/OV and /etc/opt/OV directory trees are cleared. If you restore from an ITO configuration backup, the following directories are cleared:
        - /var/opt/OV/share/databases
        - /etc/opt/OV/share/registration
        - /etc/opt/OV/share/conf
        - /var/opt/OV/share/tmp/OpC/mgmt_sv
        - /var/opt/OV/share/tmp/OpC/distrib
    * Restores the backed up directories.
    * Restarts the Oracle database processes and the SQL*Net or Net8 listener.
    * Restarts the OpenView platform services, the ITO server processes and any integrated products, by using ovstart (1m).
    * If a full backup is applied and the management server acts as a managed node, the ITO agent processes are restarted.

FILES
    /var/opt/OV/log/OpC/opcbkup.log
        The transcript of the opc_backup script will also be written to this logfile. Each time opc_backup is called, it will be overwritten.

    /tmp/opcrec.log
        The transcript of the opc_recover script will also be written to this logfile. Each time opc_recover is called, it will be overwritten.

SEE ALSO
    ORACLE Server - ADMINISTRATION, HP OpenView IT/Operations: Administrator's Reference, fbackup (1M), frecover (1M), opc_backup (5), opcdbreorg (1m), ovstart (1m), ovstop (1m), opc (5)
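Because both transcript files are overwritten on every run, it may be useful to keep a dated copy for audit purposes. The following is a minimal sketch of one way to do that. The archive_log function and the ARCHIVE_DIR location are hypothetical and are not part of the OVO tools.

```shell
#!/bin/sh
# Copy a transcript that the next run would overwrite to a timestamped name.
# ARCHIVE_DIR is a hypothetical location; adjust it to local conventions.
ARCHIVE_DIR="${ARCHIVE_DIR:-/var/adm/ov_backup_logs}"

# archive_log <logfile>: prints the path of the archived copy on success.
archive_log() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "no transcript at $log_file" >&2
        return 1
    fi
    mkdir -p "$ARCHIVE_DIR" || return 1
    stamp="$(date +%Y%m%d-%H%M%S)"
    target="$ARCHIVE_DIR/$(basename "$log_file").$stamp"
    # -p preserves the modification time of the original transcript.
    cp -p "$log_file" "$target" && echo "$target"
}
```

A typical use would be to call archive_log /var/opt/OV/log/OpC/opcbkup.log after each opc_backup run, and archive_log /tmp/opcrec.log after each opc_recover run. The default archive directory is deliberately placed outside /opt/OV, /var/opt/OV, and /etc/opt/OV, since those trees are cleared when restoring from a full backup.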
The ovbackup.ovpl and ovrestore.ovpl programs make it possible to perform a backup while the OVO GUI and server processes are running. This method is sometimes called a "hot backup." It works by pausing the OV processes for a few seconds and copying the memory-resident operational and analytical data to disk. The backup includes the databases of both NNM and OVO configuration data, but not the OV software. Internally, ovbackup.ovpl calls the OVO backup scripts, which take advantage of Oracle online backup techniques (described in more detail in Chapter 18, "Oracle for OpenView").
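Since the online backup only copies data to disk, a follow-up step is usually needed to move that copy to offline storage. The sketch below shows one possible wrapper in dry-run form. STAGE_DIR and ARCHIVE are hypothetical paths, and the -d destination option is an assumption that should be confirmed against the ovbackup.ovpl manual page for your release.

```shell
#!/bin/sh
# Sketch: run the online backup, then archive the staged copy.
# STAGE_DIR and ARCHIVE are hypothetical; the -d option is an assumption
# to confirm against the ovbackup.ovpl manual page on your release.
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands only
STAGE_DIR="${STAGE_DIR:-/var/opt/OV/tmp/ovbackup}"
ARCHIVE="${ARCHIVE:-/backup/ovbackup.tar}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

hot_backup() {
    # Stage the online backup on disk; stop if the backup itself fails.
    run ovbackup.ovpl -d "$STAGE_DIR" || return 1
    # Bundle the staged copy so it can be written to tape or moved off site.
    run tar -cf "$ARCHIVE" "$STAGE_DIR"
}
```

With the default DRY_RUN=1, calling hot_backup prints the two commands without executing them, which makes it easy to review the paths before scheduling the job.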