mismatch_cnt != 0, member content mismatch, but md says the mirror is good
Benjamin Scott
dragonhawk at gmail.com
Mon Feb 22 15:06:15 EST 2010
On Mon, Feb 22, 2010 at 10:06 AM, Michael ODonnell
<michael.odonnell at comcast.net> wrote:
> Anybody else running CentOS5.x ...
liberty.gnhlug.org is running CentOS 5.4 with kernel
2.6.18-92.1.22.el5 (still haven't rebooted it).
liberty$ fgrep -i sync /var/log/kernel* | fgrep -i raid
/var/log/kernel:Feb 21 04:22:01 liberty kernel: md: syncing RAID array md0
/var/log/kernel.1:Feb 14 04:22:01 liberty kernel: md: syncing RAID array md0
/var/log/kernel.2:Feb 7 04:22:01 liberty kernel: md: syncing RAID array md0
/var/log/kernel.3:Jan 31 04:22:01 liberty kernel: md: syncing RAID array md0
/var/log/kernel.4:Jan 24 04:22:01 liberty kernel: md: syncing RAID array md0
liberty$
It appears to be doing that at 4:22 AM (US Eastern) every Saturday,
for device md0. 4:22 AM is also the timestamp on those email messages
about mismatch_cnt I keep getting.
Interestingly, the "md1" name does not appear in the log files. (To
review: md0 is the /boot/ partition, which is small and quiet. md1 is
the everything-else LVM PE partition, which hosts swap space and all
other filesystems.)
> It looks like the RAIDs on at least seven of our (mostly stock) CentOS5.4
> systems are routinely getting broken and going through a resync operation
> on a weekly basis at 4:22am which is when that /etc/cron.weekly script
> runs that's generating the mismatch_cnt warnings in question.
I believe the script is </etc/cron.weekly/99-raid-check>. It
apparently uses a config file </etc/sysconfig/raid-check>, which is
well-commented.
The control flow of the script seems to be: The operation is only
run if the array is in a clean and idle state. If the array is
degraded or rebuilding, the operation is skipped for that array. The
default operation is "check", not "repair".
  The operation is selected by "CHECK=" in the config file, and can
be either "check" or "repair"; on liberty, it is "check".  Both
operation types presumably scan all blocks on all members.  In the world of
RAID, this is often called "patrol read". It's a good thing. Disks
tend to be full of files which are rarely read. If one of those files
develops a bad sector, the disk won't notice until you try to read it.
Then when a different disk dies and the RAID subsystem tries to
rebuild from the remaining member(s), you discover your redundant
disks weren't as redundant as you would have liked.
Assuming the comments in the config file are accurate, the "check"
operation will only attempt to repair "bad sectors". Exactly how "bad
sectors" are detected isn't explained, but I presume it means "could
not read the block from one member device". Exactly what "repair"
means isn't explained, but I presume it means "write the block from
the good device to the other device". (This is a good thing -- any
hard disk manufactured within the past 20 years or so will remap a bad
block once it is written to. And modern hard disks are virtually
guaranteed to have bad blocks.)
Again assuming the comments are accurate, "check" will *not* attempt
to repair mismatches. Mismatches are when all member devices could be
read, but the data is not consistent across devices. This is what
"mismatch_cnt" reportedly reflects. "repair", on the other hand, will
attempt to make the array consistent. How the "repair" will choose
which data to keep is not explained, but the phrase "luck of the draw"
is used.
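(Incidentally, you don't have to wait for cron to run.  The md sysfs interface lets you kick off a check by hand and read the count afterward -- something like this, as root, substituting your array name:)

```
liberty# echo check > /sys/block/md0/md/sync_action
liberty# cat /sys/block/md0/md/sync_action      # "idle" once it finishes
liberty# cat /sys/block/md0/md/mismatch_cnt
```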
You can set "ENABLED=no" in the config file to disable the whole
thing, but before you do that, see above about "patrol read". If you
think patrol read is a bad idea, you're probably wrong.
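For reference, the two knobs discussed above amount to something like this (a sketch of </etc/sysconfig/raid-check>, not the full file -- the real one has considerably more commentary):

```shell
# Sketch of /etc/sysconfig/raid-check, showing only the two
# variables discussed above:
ENABLED=yes    # set to "no" to skip the weekly run entirely
CHECK=check    # "check" = detect only; "repair" = also fix mismatches
```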
So! Having done what I should have done in the first place (RTFM
and RTFS), I now know why the problem was detected, and what my
options for remediation are. That leaves "How did the mismatch occur
in the first place?" as my remaining question.
Based on what I'm seeing (in particular, the mismatch *only* being
in the GRUB stage2 file), I'm going to conclude liberty's mismatch is
due to GRUB being installed on both physical hard disks independently
(booting from floppy).  Whether or not that's the right way to
install GRUB is an open question.
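(For the curious, "installed on both disks independently" means something like the following from the GRUB legacy shell.  Partition numbers here are illustrative -- assuming /boot is the first partition on each disk:)

```
grub> root (hd0,0)
grub> setup (hd0)
grub> root (hd1,0)
grub> setup (hd1)
```

Each setup writes stage1/stage2 to that disk separately, so the two copies of stage2 can end up byte-for-byte different -- and md has no way of knowing which one you meant.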
Assuming that is a valid way to install GRUB, the system should be
fine, including for kernel updates, until and unless one mirror member
fails. Any updates to kernel files will write to blocks other than
the GRUB stage2 file, and be properly mirrored. But if a mirror
member dies, then once that bad disk is replaced, the system will copy
the good mirror to the new disk, including the "wrong copy" of GRUB.
-- Ben