On the Reliability of Automagic RAID

Bill McGonigle bill at bfccomputing.com
Thu Jan 12 09:22:00 EST 2006


So back in the day, you set up an /etc/raidtab and RAID figured out how 
to bootstrap itself.  If it was there and the first disk wasn't dead, 
you were golden; if it wasn't there, you were in for a manual rebuild of 
said file and much pain.

So, with the advent of RAID headers on partitions and automagic RAID 
detection, the world was better.  Maybe...

I happened upon a system this morning, with a recent FC4 install, that 
wasn't so happy.  The machine had hung once (IDE controller failure) 
and was rebooted.  Everything was RAID 1 mirrored, originally:

   /dev/md0 - /dev/hda1, /dev/hdc1 - ext3 - /boot
   /dev/md1 - /dev/hda4, /dev/hdc4 - ext3 - /root
   /dev/md2 - /dev/hda2, /dev/hdc2 - swap

So, first I tried the FC4 rescue disc.  I could have sworn a previous 
FC rescue disc autodetected mirrors and LVM, but this one didn't.  
Anybody have a favorite rescue disc that _does_?

So, back into the booted system.  Today, it looks like:

   /dev/md0 - /dev/hda1, /dev/hdc1 - ext3 - /boot (CHECK)
   /dev/md1 - /dev/hda4 - ext3 - /root (UH, OH)
   /dev/md2 - /dev/hdc4 - ext3 - not mounted (DOUBLE UH, OH)
   /dev/hda2 - swap (DANG IT!)

So, it was really easy to hot-add the pieces back to the right RAID 
sets, and the machine wasn't into swap, so it was easy to fix that too. 
But I feel like I shouldn't have been here in the first place, and now 
I feel less confident that the systems I've set up with RAID mirrors for 
peace of mind are actually in good health.  I'm thinking a Nagios probe 
might be required to keep me sleeping well.

Any thoughts?

-Bill
-----
Bill McGonigle, Owner           Work: 603.448.4440
BFC Computing, LLC              Home: 603.448.1668
bill at bfccomputing.com           Cell: 603.252.2606
http://www.bfccomputing.com/    Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf




More information about the gnhlug-discuss mailing list