Software RAID issues (was Re: Suggestions solicited, server bring up)
Ben Scott
dragonhawk at gmail.com
Fri Nov 20 16:43:36 EST 2009
On Fri, Nov 20, 2009 at 2:42 PM, Bill McGonigle <bill at bfccomputing.com> wrote:
>> Software-based solutions -- which don't kick in
>> until the OS is running -- sometimes get caught up trying to boot from
>> a failed disk.
>
> "Please don't use RAID-5".
> A healthy, properly configured (and tested) RAID-1 will boot nicely.
It's not an issue with RAID 1 vs 5. The issue is that non-RAID
cards are not intelligent. Here's the scenario:
System powers on. Disks spin up, servo heads, come online. BIOS
sees both disks as reporting online. BIOS reads the MBR from disk 0,
finds a valid signature, tries to boot from it. The bootstrap in the
MBR proceeds to try and load additional stages. One of those includes
a bad block. Loader aborts with an I/O error. System sits there like
a dumb shit forever. Disk 1 is fine and would work, but the BIOS
doesn't know that.
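
  For anyone who prefers code to prose, here is a rough toy model of
that sequence in Python. It is only a sketch under assumptions: the
Disk class, the bios_boot function, and the block numbers are all made
up for illustration, and no real BIOS or boot loader is being modeled
in detail. The point it shows is that the choice of boot disk is made
once, from the MBR signature alone, and nothing falls back afterwards.

```python
from dataclasses import dataclass, field


class DiskReadError(Exception):
    """Raised when a read lands on a bad block."""


@dataclass
class Disk:
    # Hypothetical stand-in for a physical disk; not a real API.
    name: str
    bad_blocks: set = field(default_factory=set)
    has_valid_mbr: bool = True

    def read(self, block: int) -> str:
        if block in self.bad_blocks:
            raise DiskReadError(f"{self.name}: I/O error at block {block}")
        return f"{self.name}:{block}"


def bios_boot(disks, loader_blocks):
    """Model of a plain (non-RAID) boot: pick a disk once, never fall back."""
    # The BIOS's only decision: first disk whose MBR carries a valid signature.
    boot_disk = next((d for d in disks if d.has_valid_mbr), None)
    if boot_disk is None:
        return "no bootable disk"
    try:
        # From here on the MBR bootstrap is in charge, and it only knows
        # about the disk it was loaded from.
        for block in loader_blocks:
            boot_disk.read(block)
    except DiskReadError as err:
        # Nobody retries on the other disk; the system just sits there.
        return f"hung: {err}"
    return f"booted from {boot_disk.name}"


disk0 = Disk("disk0", bad_blocks={5})  # one bad block under a later loader stage
disk1 = Disk("disk1")                  # healthy mirror half, never consulted
print(bios_boot([disk0, disk1], range(1, 10)))
```

  Swapping the order of the two disks in the list makes the same call
boot fine, which is exactly the frustrating part: the good copy is
right there.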
An alternate scenario is the system just hangs or aborts trying to
read the MBR from disk 0.
Optionally, have a watchdog that reboots the system. System sits in
a boot loop forever.
If this hasn't happened to you yet, lucky you. I sincerely hope
your luck continues. My luck is not as good as yours.
With a hardware RAID controller, the first time disk 0 has a bad
block, the controller will fail that disk out of the RAID set, and use
disk 1 for everything. The BIOS is never even aware there is a
problem.
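
  The hardware RAID behavior can be sketched the same way. Again, this
is a toy model under assumptions: MirrorController and Disk are
invented names, and real controller firmware has its own policies
(retries, remapping, thresholds) that vary by vendor. It only
illustrates the idea described above: the controller drops the
failing member on the first read error and serves the read from the
mirror, so whatever sits above it sees one logical disk that works.

```python
class DiskReadError(Exception):
    """Raised when a read lands on a bad block."""


class Disk:
    # Hypothetical stand-in for a physical disk; not a real API.
    def __init__(self, name, bad_blocks=()):
        self.name = name
        self.bad_blocks = set(bad_blocks)

    def read(self, block):
        if block in self.bad_blocks:
            raise DiskReadError(f"{self.name}: I/O error at block {block}")
        return f"{self.name}:{block}"


class MirrorController:
    """Toy RAID-1 controller: presents one logical disk to the BIOS."""

    def __init__(self, members):
        self.members = list(members)  # healthy members, in preference order
        self.failed = []              # members kicked out of the set

    def read(self, block):
        while self.members:
            disk = self.members[0]
            try:
                return disk.read(block)
            except DiskReadError:
                # First bad block: fail the disk out of the set and
                # retry the same read on the next member.
                self.failed.append(self.members.pop(0))
        raise DiskReadError("all mirror members failed")


ctrl = MirrorController([Disk("disk0", bad_blocks={5}), Disk("disk1")])
results = [ctrl.read(block) for block in range(10)]  # the "BIOS" never sees an error
print(results[4], results[5], [d.name for d in ctrl.failed])
```

  After the read of block 5, everything is served from disk1 and disk0
stays out of the set until somebody replaces it.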
Note that the disks themselves are behaving correctly here; I don't
want disk drives to self-destruct and report "Not Ready" (and thus be
totally unusable) just because they have a non-zero number of bad blocks.
You could argue that the alternate scenario above is the fault of
the BIOS or disk controller, that it should be able to recover from an
I/O error on disk 0 and move on and try disk 1. You're prolly right.
But this is the pee sea platform we're talking about here. I don't
really need to explain the environment to you, do I? ;-) And the
BIOS is still helpless once the MBR bootstrap takes over.
-- Ben