Oracle Essentials [Electronic resources] : Oracle Database 10g, 3rd Edition (text version)


Jonathan Stern











10.5 Complete Site Failure


Protection from the complete failure of
your primary Oracle site poses significant challenges. Your company
must carefully evaluate the risks to its primary site. These risks
include physical and environmental problems as well as hardware
risks. For example, is the data center in an area prone to floods,
tornadoes, or earthquakes? Are power failures a frequent occurrence?
Previous versions of this book treated events such as
"a terrorist attack or an airplane crash into the
data center" as remote possibilities, but,
unfortunately, these scenarios no longer seem so implausible.

Protection from primary site failure involves monitoring of and
redundancy controls for the following:

Data center power supply

Data center climate control facilities

Database server redundancy

Database redundancy

Data redundancy


The first three items on the list are aimed at preventing the failure
of the data center. Database server redundancy, through simple hardware
failover or Real Application Clusters, provides protection from node
failure within a data center but not from complete data center loss.

Should the data center fail completely, the last two items, database
redundancy and data redundancy, provide for disaster recovery.


Emerging Technologies: Clusters Across a Distance


Some vendors are now offering
clustering solutions that allow the nodes of the cluster to be
separated by enough distance to allow one node to survive the failure
of the data center that contains the other node. In fact, it is
anticipated that many grid computing deployments will occur this way
in the future. The clustering of nodes separated by a few kilometers
is becoming possible using sophisticated interconnect technologies
that can function over greater distances. The disks are mirrored with
a copy at each site to allow each site to function in the event of a
complete failure of the other site.

These solutions are intriguing because they can provide data server
redundancy and data center redundancy in a single solution. If one
node (or the data center containing it) fails, the node in the other
data center provides failover.


10.5.1 Oracle Data Guard: Standby Database for Redundancy


Oracle's physical standby database functionality was introduced in
Oracle 7.3 to provide database redundancy. In Oracle9i, this concept
was extended to include support for a logical standby database. The
enhanced feature set is called Oracle Data Guard.

The concept of a physical standby database is simple: keep a
copy of the database files at a second location, ship the redo logs
to the second site as they are filled, and apply them to the copy of
the database. This process keeps the standby database
"a few steps" behind the primary
database. If the primary site fails, the standby database is opened
and becomes the production database. The potential data loss is
limited to the transactions in any redo logs that have not been
shipped to the standby site. Figure 10-10 illustrates the standby
database feature.
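
The shipping behavior described above can be sketched as a toy simulation. This is only an illustrative model, not Oracle's implementation: the `Primary` and `Standby` classes, the `log_capacity` setting, and the `lost_on_failover` helper are all hypothetical names invented for the example. It shows why the standby stays "a few steps" behind and why the exposure is limited to redo that has not yet been shipped.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Toy standby: holds the transactions it has received and applied."""
    applied: list = field(default_factory=list)

    def apply_log(self, log):
        self.applied.extend(log)


@dataclass
class Primary:
    """Toy primary: ships a redo log to the standby only once it is full."""
    standby: Standby
    log_capacity: int = 3  # hypothetical: transactions per redo log
    current_log: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def commit(self, txn):
        self.committed.append(txn)
        self.current_log.append(txn)
        if len(self.current_log) == self.log_capacity:
            # The log is full: ship it and start a new one (a "log switch").
            self.standby.apply_log(self.current_log)
            self.current_log = []


def lost_on_failover(primary):
    """Transactions committed on the primary but absent from the standby."""
    return [t for t in primary.committed if t not in primary.standby.applied]


standby = Standby()
primary = Primary(standby)
for n in range(1, 8):  # commit transactions 1 through 7
    primary.commit(n)

print(standby.applied)            # [1, 2, 3, 4, 5, 6]
print(lost_on_failover(primary))  # [7]
```

In this model two full logs have been shipped and applied, while the seventh transaction sits in a log that has not yet filled; it is the only work that would be lost if the primary site failed at this moment.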


Figure 10-10. Standby database


The physical standby database can be opened only for read-only
access. You can use read-only access to offload reporting, such as
end-of-day reports, from the primary server to the standby server.
The ability to offload reporting requests provides flexibility for
reporting and queries and can help performance on the primary server
while making use of the standby server.

Prior to Oracle Database 10g, archived redo information from the
primary site could not be applied while the standby database was being
used for reporting; recovery could continue only once the standby
database was closed again. This factor has important implications for
the time it will take to recover from an outage with the standby
database. If the primary site fails while the standby database is open
for reporting, the archived redo information that accumulated while
the standby database was open for queries must be applied before the
standby is brought online. Applying this backlog increases the
duration of the outage. You'll need to weigh the benefits of using the
standby database for reporting against the recovery time and the
duration of the outage should a failure occur while archived redo
information is not being applied at the standby. Oracle Database 10g
introduces a real-time apply feature that enables redo data to be
applied at the standby as soon as it is received.
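
The trade-off between reporting and recovery time can be made concrete with a rough estimate. The sketch below assumes redo is generated and applied at constant rates, which is a simplification; the function name `catch_up_minutes` and all of the figures are hypothetical and would need to be replaced with measurements from your own site.

```python
def catch_up_minutes(open_minutes, redo_mb_per_minute, apply_mb_per_minute):
    """Estimate the minutes needed to apply redo that piled up during reporting.

    Assumes constant generation and apply rates (a simplification).
    """
    if apply_mb_per_minute <= 0:
        raise ValueError("apply rate must be positive")
    backlog_mb = open_minutes * redo_mb_per_minute
    return backlog_mb / apply_mb_per_minute


# Hypothetical figures: the standby was open for reporting for 120 minutes,
# the primary generates 10 MB of redo per minute, and the standby can
# apply 60 MB of redo per minute.
print(catch_up_minutes(120, 10, 60))  # 20.0
```

Under these assumed rates, a failure at the end of a two-hour reporting window would add roughly 20 minutes of redo application to the outage before the standby could be brought online.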

Once a standby database is opened for read/write access, as opposed
to read-only access, it can no longer be used as a standby database
and you cannot resume applying archived redo information later. The
standby database must be
"re-cloned" from the primary site
if it is opened accidentally in read/write mode.

10.5.1.1 Logical standby database


Oracle Data Guard also offers a
logical standby database capability. With this capability, the
standard Oracle archive logs are transformed into SQL transactions,
and these are applied to an open standby database. The logical
standby database is different physically from the primary
database and can be used for different tasks. For example, the
primary database might be indexed for transaction processing while
the standby database might be indexed for data warehousing. Although
physically different from the primary database, the secondary
database is logically the same and can take over processing in case
the primary fails. As archive logs are shipped from the primary to
the secondary, undo records in the shipped archive log can be
compared to the logical standby undo records to guard against
potential corruption. As of Oracle Database 10g,
you can instantiate the logical standby database without quiescing
the primary.

10.5.1.2 Oracle Data Guard management


The Oracle
Data Guard broker provides monitoring and control for physical and
logical standby databases and components. A single command can be
used to perform failover. Oracle Enterprise Manager provides a Data
Guard Manager GUI for setting up, monitoring, and managing the
standby database.

The Oracle Database 10g Data Guard broker adds
support for creating and managing configurations containing RAC
primary and standby databases. The Data Guard broker leverages the
Cluster Ready Services in Oracle Database 10g.


10.5.2 Possible Causes of Lost Data with a Physical Standby Database


There is a
possibility that you will lose data, even if you use a physical
standby database. There are three possible causes of lost data in the
event of primary site failure:

Archived redo logs have not been shipped to the standby site

Filled online redo logs have not been archived yet

The current online redo log is not a candidate for archiving until a
log switch occurs


These three potential problems are addressed in different ways, as
described in the following sections.
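
One way to keep the three cases straight is to label each redo log by how far it has progressed toward the standby site. The sketch below is purely illustrative: the `LogState` names, the `at_risk` helper, and the sequence numbers are hypothetical and do not correspond to any Oracle view or API.

```python
from enum import Enum


class LogState(Enum):
    CURRENT = "current online redo log (no log switch yet)"
    FILLED_UNARCHIVED = "filled online redo log, not yet archived"
    ARCHIVED_UNSHIPPED = "archived, not yet shipped to the standby site"
    SHIPPED = "archived and shipped to the standby site"


def at_risk(logs):
    """Return the log sequence numbers whose redo exists only at the primary."""
    return sorted(seq for seq, state in logs.items()
                  if state is not LogState.SHIPPED)


# Hypothetical snapshot of redo logs at the moment the primary site fails.
logs = {
    101: LogState.SHIPPED,
    102: LogState.SHIPPED,
    103: LogState.ARCHIVED_UNSHIPPED,
    104: LogState.FILLED_UNARCHIVED,
    105: LogState.CURRENT,
}
print(at_risk(logs))  # [103, 104, 105]
```

Every log that has not reached the `SHIPPED` state represents work that would be lost, one log for each of the three causes listed above.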

10.5.2.1 Copying archived redo logs to a standby site


Prior to Oracle8i, copying of archived redo logs
from the primary to the standby site was not automated. You were free
to use any method to copy the files across the network. For example,
you could schedule a batch job that copies archived logs to the
standby site every N minutes. If the primary
site fails, these copies would limit the lost redo information (and
therefore the lost data) to a maximum of N
minutes of work.
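
A batch job of this kind might look like the following sketch. It is not taken from the book or from Oracle: the function name `copy_new_archived_logs`, the `*.arc` file pattern, and the directory paths in the usage comment are assumptions made for illustration, and the sketch presumes the standby's archive directory is reachable as an ordinary filesystem path (for example, over a network mount).

```python
import shutil
from pathlib import Path


def copy_new_archived_logs(archive_dir, standby_dir, pattern="*.arc"):
    """Copy archived logs not yet present in standby_dir; return names copied."""
    archive_dir, standby_dir = Path(archive_dir), Path(standby_dir)
    standby_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(archive_dir.glob(pattern)):
        dst = standby_dir / src.name
        if not dst.exists():
            # Copy under a temporary name, then rename, so the standby side
            # never sees a half-written file under its final name.
            tmp = dst.with_name(dst.name + ".part")
            shutil.copy2(src, tmp)
            tmp.replace(dst)
            copied.append(src.name)
    return copied


# Hypothetical usage, run by a scheduler every N minutes:
#   copy_new_archived_logs("/u01/oradata/arch", "/mnt/standby/arch")
```

Because the job only copies files that are missing at the destination, it can be rerun safely; the interval at which the scheduler runs it is the N that bounds the lost work.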

Oracle8i first provided support for the archiving of redo logs to a destination on
the primary server as well as on multiple remote servers. This
feature automates the copying and application of the archived redo
logs to one or more standby sites. The lost data is then limited to
the contents of any filled redo logs that have not been completely
archived, as well as the current online redo log. Oracle also
automatically applies the archived redo logs to the standby database
as they arrive.

Oracle9i added the option to specify zero data
loss to a standby machine. In this mode, all changes to a local log
file are written synchronously to a remote log file. This mode
guarantees that switching over to the standby database will not
result in any lost data. As you might guess, this mode may impact
performance, as each log write must also be completed to a remote log
file. Oracle provides an option that will only wait to write to a
remote log for a specified period of time, so that a network failure
will not bring database processing to a halt.
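
The effect of the wait limit can be illustrated with a small model. This is a sketch of the general idea only, not of Oracle's actual mechanism: the function `ship_redo`, its parameters, and the latency figures are hypothetical. Each log write either waits for the remote acknowledgment (and is protected) or gives up after the timeout so that processing continues (and is, for the moment, unprotected).

```python
def ship_redo(remote_latencies_ms, timeout_ms):
    """Model synchronous remote log writes with a bounded wait.

    remote_latencies_ms: hypothetical time each remote write would take,
        or None if the network is down for that write.
    Returns (wait_per_write_ms, unprotected_count).
    """
    waits, unprotected = [], 0
    for latency in remote_latencies_ms:
        if latency is not None and latency <= timeout_ms:
            waits.append(latency)      # waited for the remote acknowledgment
        else:
            waits.append(timeout_ms)   # gave up waiting; primary continues
            unprotected += 1
    return waits, unprotected


# Hypothetical run: five log writes, one during a network outage (None)
# and one over a badly congested link (50 ms), with a 10 ms wait limit.
print(ship_redo([2, 3, None, 50, 2], timeout_ms=10))
# ([2, 3, 10, 10, 2], 2)
```

The model makes the trade-off visible: a shorter timeout keeps the primary responsive during network trouble, but every write that times out is one for which zero data loss is no longer assured.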

10.5.2.2 Unarchived redo information and the role of geomirroring


If you cannot allow primary site failure to result in the loss of
any data, and do not choose to use the zero data
loss option of Data Guard, the solution is to mirror all redo log and
control file activity from the primary site to the standby site.

You can provide this level of reliability by using a remote mirroring
technology sometimes known as
geomirroring.
Essentially, all writes to the online redo log files and the control
files at the primary site must be mirrored synchronously to the
standby site. For simplicity, you can also geomirror the archived log
destination, which will duplicate the archived logs at the remote
site, in effect copying the archived redo logs from the primary to
the standby site. This approach can simplify operations; you use one
solution for all the mirroring requirements, as opposed to having
Oracle copy the archived logs and having geomirroring handle the
other critical files.

Geomirroring of the online redo logs results in every committed
transaction being written to both the online redo log at the primary
site and the copy of the online redo log at the standby site. This
process adds some time to each transaction for the mirrored write to
reach the standby site. Depending on the distance between the sites
and the network used, geomirroring can hamper performance, so you
should test its impact on the normal operation of your database.
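
As a starting point for such testing, you can estimate the minimum delay that distance alone imposes. The sketch below assumes signals travel through optical fiber at roughly 200,000 km per second (about two-thirds of the speed of light in a vacuum) and ignores switching, protocol, and disk overhead, so real figures will be higher. The function names are hypothetical.

```python
FIBER_KM_PER_MS = 200  # approximation: ~200,000 km/s in optical fiber


def min_round_trip_ms(distance_km):
    """Lower bound on the round-trip time for one mirrored write."""
    return 2 * distance_km / FIBER_KM_PER_MS


def max_serial_writes_per_second(distance_km):
    """Upper bound on mirrored writes per second if each waits for the last."""
    return 1000 / min_round_trip_ms(distance_km)


# Hypothetical sites 100 km apart.
print(min_round_trip_ms(100))             # 1.0
print(max_serial_writes_per_second(100))  # 1000.0
```

Under these assumptions, sites 100 km apart add at least a millisecond to every synchronously mirrored redo write, and the penalty grows in proportion to the distance.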

Geomirroring provides the most complete protection against primary
site failure and, accordingly, it's a relatively
expensive solution. You will need to weigh the cost of the
sophisticated disk subsystems and high-speed telecommunication lines
needed for nonintrusive geomirroring against the cost of losing the
data in any unarchived redo logs and the current online redo log. See
Appendix B for where to find more information about geomirroring.

