Software RAID issues (was Re: Suggestions solicited, server bring up)

Mon Nov 23 09:25:58 EST 2009

On Sat, Nov 21, 2009 at 6:02 PM, Bill McGonigle <bill at bfccomputing.com> wrote:

> On 11/21/2009 11:40 AM, Bruce Labitt wrote:
> > Bill, why not RAID-5?  Isn't RAID-5 supposed to be ultra-reliable?  As
> > in hot swap disks?  Or does this just apply to software RAID-5...
> >
> > -Bruce
> > who knows very little about this RAID stuff...
>
> RAID-5 itself has a problem known as the "RAID-5 write hole" where data
> loss can be guaranteed in certain situations.  RAID-6 is a patch to
>

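I think the RAID 5 write hole is separate from the slowdown on small writes
with RAID 5: it is the window where a crash or power loss between the data
write and the parity write leaves a stripe inconsistent.  Otherwise, in
order to lose data, a 2nd drive needs to fail (as opposed to only 1 drive
on a RAID 0 or JBOD).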
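
For anyone new to parity RAID, here is a minimal sketch of how a single
parity block works.  It is a toy model with a made-up 4-drive stripe and my
own helper name (xor_blocks), not how md or any controller actually lays out
data.  It shows a rebuild after one drive dies, the extra reads a small
write needs, and what happens if the parity update never lands.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe on a hypothetical 4-drive RAID 5: 3 data blocks + 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Drive 1 dies: rebuild its block from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Small write to drive 0: read old data and old parity, then write both.
new_block = b"ZZZZ"
new_parity = xor_blocks([parity, data[0], new_block])
assert new_parity == xor_blocks([new_block, data[1], data[2]])

# If the data write lands but the parity write does not (crash in between),
# a later rebuild of drive 1 silently returns the wrong bytes.
stale_rebuild = xor_blocks([new_block, data[2], parity])
assert stale_rebuild != data[1]
```

The read-old-data, read-old-parity step is where the small-write penalty
comes from, and the last case is why stale parity only hurts once a drive
actually has to be rebuilt.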

RAID 6 keeps 2 drives' worth of parity.  Your data is ok until a 3rd drive
fails.  I think another reason to go with 6 vs 5 is reduced head thrashing
when 1 drive is dead.

> prevent this.
>
> But RAID-5/6 also come with complexity, and software is buggy.  The main
>

That's an implementation issue, not inherent to RAID 5 or 6.

> advantage of RAID-5 is getting more disk space usable per dollar.  But
>

Exactly.
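
To put rough numbers on that, here is a small sketch of usable space and
failures survived per level.  The function name is my own, and it assumes
equal-size drives, textbook layouts, and RAID 1 as a plain n-way mirror.

```python
def raid_summary(level, drives):
    """Return (usable drives' worth of space, drive failures survived)."""
    minimum = {"raid0": 2, "raid1": 2, "raid5": 3, "raid6": 4}
    if level not in minimum:
        raise ValueError(f"unknown level: {level}")
    if drives < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} drives")
    if level == "raid0":
        return drives, 0          # striping only, no redundancy
    if level == "raid1":
        return 1, drives - 1      # every drive holds the same data
    if level == "raid5":
        return drives - 1, 1      # one drive's worth of parity
    return drives - 2, 2          # raid6: two drives' worth of parity

for level, drives in [("raid1", 2), ("raid5", 4), ("raid6", 6)]:
    usable, survives = raid_summary(level, drives)
    print(f"{level} x{drives}: {usable}/{drives} usable, "
          f"survives {survives} failure(s)")
```

With small arrays the space advantage of parity over mirroring is modest; it
grows as you add drives, which is where the cost-per-usable-TB argument
comes from.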

> with cheap disk space under $80/TB RAID-1 (simple mirroring) gets you
> less complex reliability and better performance.  And for a boot disk,
> having only one surviving drive is sufficient to get your machine
> running again.
>

I think most software RAID only does mirrors for boot.  RAID 1, not 5.

RAID5 will have faster read performance than RAID 1 or a single disk.  It
might be faster for reads than RAID-0 (striping) also.

> ZFS's RAID-Z is probably the exception to the rule as everything is
> round-trip checksummed, but I still wouldn't use it for a boot disk
> since a boot disk set doesn't need to be big enough to justify any cost
> savings.
>

ZFS's RAIDZ is like RAID5 in allocation, but it doesn't have the write
performance penalty of RAID5.  ZFS also has RAIDZ2 like RAID6 and RAIDZ3
which has 3 parity disks.

The checksum is done for non RAID in ZFS as well.  It can't survive a disk
failure, but it can detect & recover from bad blocks.  Most other
filesystems will silently corrupt the data.  I think the only other
filesystems that checksum are NetApp's WAFL(?) and Linux's btrfs.
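
Here is a toy sketch of the idea, not ZFS's real on-disk format or code.
The class name is made up, SHA-256 is just a convenient stand-in for
whatever checksum the filesystem uses, and it assumes a second copy of the
block exists to repair from.

```python
import hashlib

def checksum(payload):
    return hashlib.sha256(payload).digest()

class ToyPool:
    """Toy store: N copies of each block, plus a checksum kept separately."""

    def __init__(self, copies=2):
        self.copies = [dict() for _ in range(copies)]
        self.sums = {}

    def write(self, addr, payload):
        for copy in self.copies:
            copy[addr] = payload
        self.sums[addr] = checksum(payload)

    def read(self, addr):
        good = None
        for copy in self.copies:
            if checksum(copy[addr]) == self.sums[addr]:
                good = copy[addr]
                break
        if good is None:
            raise IOError(f"block {addr}: every copy fails its checksum")
        # Self-heal: overwrite any bad copy with the verified one.
        for copy in self.copies:
            copy[addr] = good
        return good

pool = ToyPool(copies=2)
pool.write(7, b"payroll data")
pool.copies[0][7] = b"payroll dat\x00"       # silent corruption on one copy
assert pool.read(7) == b"payroll data"       # detected, served from good copy
assert pool.copies[0][7] == b"payroll data"  # and the bad copy is repaired
```

With only one copy the same read raises an error instead of handing back
bad data, which is the difference from a filesystem that does not checksum.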

One advantage to ZFS on the boot disk is reallocating partitions on the
fly.  When Solaris does an upgrade, it creates a new partition for the
upgrade OS alongside the original.  When you boot, grub points to the new
partition.  If it works, you can deallocate the old partition back to the
pool.  If it doesn't, you reboot to the old partition and return the new
one to the pool.

Obviously I'm a ZFS fanboy and I'm very disappointed it won't be adopted in
Linux due to its license.  It's in FreeBSD (and FreeNAS). btrfs looks like
it has some nice improvements so I'm hoping to see it succeed alongside ZFS.