On the Reliability of Automagic RAID

Bill McGonigle bill at bfccomputing.com
Thu Jan 12 14:08:02 EST 2006


On Jan 12, 2006, at 09:50, Ben Scott wrote:

>   Was the RAID subsystem showing the md1 virtual device as a RAID-1
> mirror with a failed member?  If so, check the logs.

Ah, very interesting. The RAID logs are good these days: every reboot 
assembled successfully until the most recent one, where it says, 'hda4 
and hdc4 have the same UUID but different superblocks.'  I wonder how 
_that_ happened.

It might have something to do with the IDE controller frying, but that 
was a different controller.  Maybe the IDE subsystem went to hell when 
that happened.  Still not the ideal behavior - Linux thought the 
controller 'went offline' (which it sure as heck did).

>>    /dev/md2 - /dev/hdc4 - ext3 - not mounted (DOUBLE UH, OH)
>
>   What was the RAID subsystem saying for the status of the md2 virtual
> device?  Did the logs give any clue as to why the filesystem wasn't
> mounted?

hdc4 _used_ to be the second member of the md1 mirror.  When the RAID 
autodetect saw the UUID match and the superblock difference it said, 
"ah, that must be a separate RAID set" and made a new md device for me.  
That's awfully nice of it but not the right failure mode, I think.  I'd 
rather it say, "hey - your RAID set is screwed up, Mr." when the UUIDs 
match but something else doesn't.
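
As a rough illustration of that preferred failure mode, here is a 
sketch in Python.  It is not how the md driver works: the "superblock 
difference" is reduced to a single event counter, and the function name, 
the tuple layout, the UUID strings and the counts are all made up for 
the example.

```python
from collections import defaultdict


def classify_members(members):
    """Group RAID members by set UUID and flag sets that disagree.

    `members` is a list of (device, uuid, events) tuples.  This layout
    is hypothetical; a real superblock carries far more than this.
    Returns {uuid: (status, [devices])} where status is "ok" or
    "conflict".
    """
    by_uuid = defaultdict(list)
    for device, uuid, events in members:
        by_uuid[uuid].append((device, events))

    result = {}
    for uuid, devs in by_uuid.items():
        # Same UUID but differing event counts: report the conflict
        # instead of quietly splitting the set into a new md device.
        event_counts = {events for _, events in devs}
        status = "ok" if len(event_counts) == 1 else "conflict"
        result[uuid] = (status, sorted(dev for dev, _ in devs))
    return result


# Illustrative data only; the UUIDs and event counts are invented.
members = [
    ("hda4", "uuid-A", 120),
    ("hdc4", "uuid-A", 97),
    ("hda2", "uuid-B", 40),
    ("hdc2", "uuid-B", 40),
]

for uuid, (status, devices) in sorted(classify_members(members).items()):
    if status == "conflict":
        print("RAID set %s is inconsistent: %s" % (uuid, ", ".join(devices)))
```

The point of the sketch is only the policy: a UUID match plus any other 
mismatch gets reported as one broken set, never assembled as two.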

> I suspect what happened here is that the auto-detect magic
> saw hda2 as a valid swap device, and mounted it.  Then the RAID
> subsystem saw hda2 as in use, and didn't activate the RAID member.

So what seems to have happened here is when it made me a new /dev/md2 
based on the above, it then went and tried to add the swap partitions 
into /dev/md2.  (I don't know how it remembered from the last boot that 
/dev/md2 was swap - that's important to know).  It then said, "hey, md2 
already exists, I'm bailing," and then later, something, swapon -a or 
hald - not sure yet, turned on swap.

So, I've hardcoded /dev/md2 in fstab as swap but that's not quite right 
either - something might change the md* order on a new failure and then 
md2 won't be md2 again.  Maybe it's better to leave it out of fstab and 
let swapon handle it.
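
One way to avoid trusting the md number at all would be to look up 
the array by its members before enabling swap.  The sketch below, in 
Python, parses text in the /proc/mdstat format and returns whichever 
array is built from the expected swap partitions.  The sample text and 
block counts are invented, the function names are hypothetical, and 
hdc2 as the second swap partition is an assumption (only hda2 is named 
in this thread).

```python
import re

# Invented sample in the /proc/mdstat format; not taken from the
# machine discussed in this thread.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 hdc2[1] hda2[0]
      1048512 blocks [2/2] [UU]

md1 : active raid1 hda4[0]
      20482752 blocks [2/1] [U_]

unused devices: <none>
"""


def array_members(mdstat_text):
    """Return {array_name: set_of_member_devices} from mdstat text."""
    arrays = {}
    for line in mdstat_text.splitlines():
        if not line.startswith("md") or " : " not in line:
            continue
        name, rest = line.split(" : ", 1)
        # Members look like "hda2[0]"; a failed one has a "(F)" suffix.
        arrays[name] = {m.group(1) for m in re.finditer(r"(\w+)\[\d+\]", rest)}
    return arrays


def find_array(arrays, expected_members):
    """Return the first array built only from expected_members, or None."""
    for name, devs in sorted(arrays.items()):
        if devs and devs <= expected_members:
            return name
    return None


arrays = array_members(SAMPLE_MDSTAT)
swap_md = find_array(arrays, {"hda2", "hdc2"})
print(swap_md)
```

On a live system the text would come from reading /proc/mdstat instead 
of SAMPLE_MDSTAT, and the result could be handed to swapon by whatever 
boot script runs it.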

-Bill
-----
Bill McGonigle, Owner           Work: 603.448.4440
BFC Computing, LLC              Home: 603.448.1668
bill at bfccomputing.com           Cell: 603.252.2606
http://www.bfccomputing.com/    Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf



