mismatch_cnt != 0, member content mismatch, but md says the mirror is good

Ben Scott dragonhawk at gmail.com
Tue Nov 3 19:46:16 EST 2009


On Tue, Nov 3, 2009 at 6:06 PM, Bill McGonigle <bill at bfccomputing.com> wrote:
> Grub shouldn't make /dev/md0 inconsistent ...

  It might, if you install it by booting from floppy and running setup
on the two disks independently, which is what I did.

  GRUB doesn't understand Linux RAID.  It just happens to be able to
boot a Linux simple mirror because a simple mirror is the same as a
single disk if you're only reading.  Which is what GRUB normally does
-- unless you're installing it.  :)

  I installed it this way for a few reasons.  One is that the GRUB docs
say it's the preferred method, as there's no really reliable way to
map the kernel's idea of your disks to BIOS drive numbers.  Another is
that in the past, I've found the "grub-install" method didn't actually
work during a failure of the first disk in the mirror set, while
installing from floppy did work (you could boot from either drive).

  Maybe I relied upon bad/old information?

> Now, do we know what kind of ECC the drive does?  It sounds like a
> multi-bit error wasn't handled (or the ECC electronics are failing in an
> infuriating fashion).

  That's certainly possible, but lacking better data, I'm more
inclined to suspect the GRUB installer.

  The thing that really gets me: How can mismatch_cnt be non-zero,
while everything else is saying the mirror is good?  I would think
mismatched blocks pretty much define the "out-of-sync mirror" failure
condition.
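
  For anyone who wants to check a "member content mismatch" by hand, here
is a minimal sketch of a block-by-block compare of two mirror members.
The function name, the 4096-byte block size, and the limit_bytes
parameter are assumptions for illustration, not anything md itself uses.
Note that md keeps its own per-member metadata, which legitimately
differs between members, so a raw compare probably needs limit_bytes (or
similar) to stay within the data area.

```python
def count_mismatched_blocks(path_a, path_b, block_size=4096, limit_bytes=None):
    """Count blocks that differ between two files or block devices.

    path_a / path_b: the two mirror members (e.g. partitions); hypothetical
    names, supply your own.  limit_bytes optionally stops the compare early,
    for example to skip per-member RAID metadata.
    """
    mismatches = 0
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while limit_bytes is None or offset < limit_bytes:
            want = block_size
            if limit_bytes is not None:
                # Do not read past the requested limit.
                want = min(block_size, limit_bytes - offset)
            block_a = a.read(want)
            block_b = b.read(want)
            if not block_a and not block_b:
                break  # both members exhausted
            if block_a != block_b:
                mismatches += 1
            offset += want
    return mismatches
```

  Reading raw members needs root, and the result is only meaningful if
the array is idle or read-only while the compare runs.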

  Plus, how did md detect the mismatches in the first place?  Does the
md driver periodically run a compare on its own?  Does mdmonitor
periodically trigger one?
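
  For reference, the md driver exposes both the counter and the scrub
control through sysfs: mismatch_cnt holds the count from the most recent
check or repair pass, and writing "check" to sync_action requests a new
read-only pass.  Below is a small sketch for poking at those files.  The
array name "md0" and the helper names are assumptions, and this does not
tell you what triggered the earlier pass (some distributions schedule one
from cron, which may or may not be the case here).

```python
from pathlib import Path


def md_status(array="md0", sysfs_root="/sys/block"):
    """Return mismatch_cnt and sync_action for one md array.

    'array' and 'sysfs_root' are illustrative defaults; adjust to taste.
    """
    md_dir = Path(sysfs_root) / array / "md"
    return {
        # Mismatches found by the last check/repair pass.
        "mismatch_cnt": int((md_dir / "mismatch_cnt").read_text().strip()),
        # Current activity, e.g. "idle", "check", "resync".
        "sync_action": (md_dir / "sync_action").read_text().strip(),
    }


def request_check(array="md0", sysfs_root="/sys/block"):
    """Ask md to start a read-only compare of the members (needs root)."""
    (Path(sysfs_root) / array / "md" / "sync_action").write_text("check\n")
```

  Calling md_status() before and after a pass started with
request_check() should show whether the count is stale or is still being
reproduced.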

-- Ben



More information about the gnhlug-discuss mailing list