6.2 RAID
Nobody likes to lose data. And since disks eventually die, often with little warning, it's wise to consider setting up a RAID (Redundant Array of Inexpensive[1] Disks) array on your database servers to prevent a disk failure from causing unplanned downtime and data loss. But there are many different types of RAID to consider: RAID 0, 1, 0+1, 5, and 10. And what about hardware RAID versus software RAID?

[1] The "I" in RAID has meant, at various times, either "Inexpensive" or "Independent." It started out as "Inexpensive," but came to be read as "Independent" because drives weren't really all that inexpensive. By the time people actually started using "Independent," the price of disks had plummeted and they really were "Inexpensive." Murphy at work.
From a performance standpoint, some options are better than others. The faster ones sacrifice something, usually price or durability, to gain that performance. In all cases, the more disks you have, the better the performance you'll get. Let's consider the benefits and drawbacks of each RAID option.[2]

[2] For a more complete treatment of this topic, consult Derek Vadala's Managing RAID on Linux, published by O'Reilly.
RAID 0
Of all the RAID types, RAID 0, or
striping,
offers the biggest performance improvement. Writes and reads are both
faster in RAID 0 than in any other configuration. Because there are
no spare or mirrored disks, it's inexpensive.
You're using every disk you pay for. But the
performance comes at a high price. There's no
redundancy at all. Losing a single disk means that your whole array
is dead.

RAID 0 should be used only when you don't care about
data loss. For example, if you're building a cluster
of MySQL slaves, it's entirely reasonable to use
RAID 0. You'll reap all the performance benefits,
and if a server does die, you can always clone the data from one of
the other slaves.
RAID 1
Moving up the scale, RAID 1, or
mirroring, isn't as fast as
RAID 0, but it provides redundancy; you can lose a disk and keep on
running. The performance boost applies only to reads. Since all the
data is on every disk in the mirrored volume, the system may decide
to read data in parallel from the disks. The result is that in the
optimal case it can read the same amount of data in roughly half the
time.

Write performance, however, is only as good as that of a single disk. It can
even be half as good depending on whether the RAID controller
performs the writes in parallel or sequential order. Also, from a
price point of view, you're paying for twice as much
space as you're using. RAID 1 is a good choice when
you need redundancy but have space or budget for only two
disks, such as in a 1U rackmount case.
RAID 5
From a performance standpoint, RAID 5, which
is striping (RAID 0) with distributed parity
blocks, can be beneficial. There are two disks involved
in every operation, so it's not substantially faster
than RAID 1 until you have more than three disks total. Even then,
its other benefit, size, shines through. Using RAID 5, you can create
rather large volumes without spending a lot of cash because you
sacrifice only a single disk's worth of space for parity. By using more, smaller disks, such as eight 36-GB disks instead of four 72-GB disks, you increase the number of spindles in the array and therefore boost seek performance and throughput.

RAID 5 is the most commonly used RAID implementation. When funds are
tight, and redundancy is clearly more important than performance,
it's the best compromise available.
RAID 10 (also known as RAID 1+0)
To get the best of both worlds (the
performance benefits of RAID 0 along with the redundancy of RAID 1),
you need to buy twice as many disks. RAID 10 is the only way to get
the highest performance on your database server without sacrificing
redundancy. If you have the budget to justify it, you
won't be disappointed.
JBOD
The configuration sometimes called "Just a Bunch of
Disks" (JBOD) provides no added performance or
redundancy. It's simply a combination of two or more
smaller disks to produce a single, larger virtual disk.
Table 6-1 summarizes various RAID features.
| Level | Redundancy | Disks required | Faster reads | Faster writes |
|---|---|---|---|---|
| RAID 0 | No | N | Yes | Yes |
| RAID 1 | Yes | 2[3] | Yes | No |
| RAID 5 | Yes | N+1 | Yes | No |
| RAID 10 | Yes | N*2 | Yes | Yes |
| JBOD | No | N/A | No | No |

[3] Two disks are typical in a mirrored set, but it's possible to use more than two. Doing so will boost read performance but doesn't change write performance.
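The "Disks required" column in Table 6-1 translates directly into usable capacity. The following sketch makes that concrete (the function name and the GB figures are illustrative, not from the text):

```python
# Illustrative sketch: usable capacity, in GB, for each RAID level
# discussed above, given N identical disks of a given size.
def usable_capacity(level: str, disks: int, disk_gb: float) -> float:
    if level in ("raid0", "jbod"):
        return disks * disk_gb        # every disk holds data
    if level == "raid1":
        return disk_gb                # all disks mirror one disk's worth
    if level == "raid5":
        return (disks - 1) * disk_gb  # one disk's worth goes to parity
    if level == "raid10":
        return disks // 2 * disk_gb   # half the disks are mirrors
    raise ValueError(f"unknown level: {level}")

# Eight 36-GB disks in RAID 5, as in the example above:
print(usable_capacity("raid5", 8, 36))   # 252 GB usable
```

Comparing the same eight disks across levels shows the trade-off at a glance: 288 GB in RAID 0, 252 GB in RAID 5, but only 144 GB in RAID 10.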
6.2.1 Mix and Match
When deciding how to configure your
disks, consider the possibility of multiple RAID arrays. RAID
controllers aren't that expensive, so you might
benefit from using RAID 5 or RAID 10 for your databases and a
separate RAID 1 array for your transaction and replication logs. Some
multichannel controllers can manage multiple arrays, and some can
even bind several channel controllers together into a single
controller to support more disks.

Doing this isolates most of the serial disk I/O from most of the
random, seek-intensive I/O. This is because transaction and
replication logs are usually large files that are read from and
written to in a serial manner, usually by a small number of threads.
So it's not necessary to have a lot of spindles
available to spread the seeks across. What's
important is having sufficient bandwidth, and virtually any modern
pair of disks can fill that role nicely. Meanwhile, the actual data
and indexes are being read from and written to by many threads
simultaneously in a fairly random manner. Having the extra spindles
associated with RAID 10 will boost performance. Or, if you simply
have too much data to fit on a single disk, RAID 5's
ability to create large volumes works to your advantage.
6.2.1.1 Sample configuration
To make this more concrete, let's see what such a
setup might look like with both InnoDB and MyISAM tables.
It's entirely possible to move most of the files
around and leave symlinks in the original locations (at least on
Unix-based systems), but that can be a bit messy, and
it's too easy to accidentally remove a symlink (or
accidentally back up symlinks instead of actual data!). Instead, you
can adjust the my.cnf file to put files where
they belong.

Let's assume you have a RAID 1 volume on which the
following filesystems are mounted: /,
/usr, and swap. You also
have a RAID 5 (or RAID 10) filesystem mounted as
/data. On this particular server, MySQL was
installed from a binary tarball into
/usr/local/mysql, making
/usr/local/mysql/data the default data
directory.

The goal is to keep the InnoDB logs and replication logs on the RAID 1 volume, while moving everything else to
/data. These my.cnf entries
can accomplish that:
datadir = /data/myisam
log-bin = /usr/local/mysql/data/repl/bin-log
innodb_data_file_path = ibdata1:16386M;ibdata2:16385M
innodb_data_home_dir = /data/ibdata
innodb_log_group_home_dir = /usr/local/mysql/data/iblog
innodb_log_arch_dir = /usr/local/mysql/data/iblog
These entries provide two top-level directories in
/data for MySQL's data files:
ibdata for the InnoDB data and
myisam for the MyISAM files. All the logs remain
in or below /usr/local/mysql/data on the RAID 1
volume.
6.2.2 Hardware Versus Software
Some operating systems can perform
software RAID. Rather than buying a dedicated RAID
controller, the operating system's kernel splits the
I/O among multiple disks. Many users shy away from using these
features because they've long been considered slow
or buggy.

In reality, software RAID is quite stable and performs rather well.
The performance differences between hardware and software RAID tend
not to be significant until they're under quite a
bit of load. For smaller and medium-sized workloads,
there's little discernible difference between them.
Yes, the server's CPU must do a bit more work when
using software RAID, but modern CPUs are so fast that the RAID
operations consume a small fraction of the available CPU time. And,
as we stressed earlier, the CPU is usually not the bottleneck in a
database server anyway.

Even with software RAID, you can use multiple disk controllers to
achieve redundancy at the hardware level without actually paying for
a RAID controller. In fact, some would argue that having two non-RAID
controllers is better than a single RAID controller.
You'll have twice the available I/O bandwidth and
have eliminated a single point of failure if you use RAID 1 or 10
across them.

Having said that, there is one thing that can be done with hardware
RAID that simply can't be done in software:
write caching. Many RAID controllers can
add battery-backed RAM that caches reads and writes. Since
there's a battery on the card, you
don't need to worry about lost writes even when the
power fails. If it does, the data stays in memory on the controller
until the machine is powered back up. Most hardware RAID controllers cache reads as well.
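On Linux, the software RAID described above is typically managed with the mdadm tool. A minimal sketch of building a two-disk mirror might look like this (the device names are assumptions; they will differ on your hardware):

```shell
# Illustrative only: assemble two partitions into a software RAID 1
# array. Run as root; verify device names against your own system.
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
cat /proc/mdstat        # watch the initial resync progress
mkfs -t ext3 /dev/md0   # then create a filesystem and mount it as usual
```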
6.2.3 IDE or SCSI?
It's a perpetual
question: do you use IDE or SCSI disks for your server? A few years
ago, the answer was easy: SCSI. But the issue is further muddied by
the availability of faster IDE bus speeds and IDE RAID controllers
from 3Ware and other vendors. For our purposes, Serial-ATA is the
same as IDE.

The traditional view is that SCSI is better than IDE in servers.
While many people dismiss this argument, there's
real merit to it when dealing with database servers. IDE disks handle
requests in a sequential manner. If the CPU asks the disk to read
four blocks from an inside track, followed by eight blocks from an
outside track, then two more blocks from an inside track, the disk
will do exactly what it's told, even if it's not the most efficient way to read all that data. SCSI disks have a feature known as Tagged Command Queuing
(TCQ). TCQ allows the CPU to send several read/write requests to the
disk at the same time. The disk controller then tries to find the
optimal read/write pattern to minimize seeks.

IDE also suffers from scaling problems; you can't
use more than one drive per IDE channel without suffering a severe
performance hit. Because most motherboards offer only four IDE
channels at most, you're stuck with only four disks
unless you add an additional controller. Worse yet, IDE has rather
restrictive cable limits. With SCSI, you can typically add 7 or 14
disks before purchasing a new controller. Furthermore, the constant
downward price pressure on hard disks has affected SCSI as much as
IDE.

On the other hand, SCSI disks still cost more than their IDE
counterparts. When you're considering four or more
disks, the price difference is significant enough that you might be
able to purchase IDE disks and be able to afford another controller,
possibly even an IDE RAID controller. Many MySQL users are quite
happy using 3Ware IDE RAID controllers with 4-12 disks on them. It
costs less than a SCSI option, and the performance is reasonably
close to that of a high-end SCSI RAID controller.
6.2.4 RAID on Slaves
As we
mentioned in the discussion of RAID 0, if you're
using replication to create a cluster of slaves for your application,
it's likely that you can save money on the slaves by
using a different form of RAID. That means using a higher-performance
configuration that doesn't provide redundancy (RAID
0), using fewer disks (RAID 5 instead of RAID 10), or using software
rather than hardware RAID, for example. If you have enough slaves,
you may not necessarily need the redundancy on the slaves. In the
event that one slave suffers the loss of a disk, you can always
synchronize it with another nearby slave to get it started
again.
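Re-synchronizing a rebuilt slave from a healthy sibling can be as simple as a file copy. A rough sketch, assuming the directory layout from the earlier my.cnf example (the hostname is hypothetical, and both servers should be stopped, or the donor's tables flushed and locked, while copying):

```shell
# Hypothetical re-clone of a failed slave from a nearby healthy one.
# After copying, restart the rebuilt slave and point replication at
# the binlog position recorded when the donor's copy was taken.
rsync -av --delete healthy-slave:/data/myisam/ /data/myisam/
rsync -av --delete healthy-slave:/data/ibdata/ /data/ibdata/
```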