GRUB & RAID have me Stumped

Ben Scott dragonhawk at gmail.com
Fri Jun 16 12:50:00 EDT 2006


On 6/16/06, Paul Lussier <p.lussier at comcast.net> wrote:
> I have set up a 4-way mirror of my OS partitions.

  Four-way?  Are you really, really afraid of data loss or something?  ;-)

>  4. Reboot, telling grub that / is now on /dev/sdb1.

  I suspect the trouble is around here.

  AFAIK, GRUB doesn't understand Linux RAID (md driver).  However,
Linux RAID mirroring results in the underlying physical partitions
being exact copies of the RAID device.  So what we do is put our boot
files on a Linux RAID mirror, and tell GRUB to use one of the
underlying devices as the GRUB root partition.  GRUB just thinks it's
accessing a regular, non-RAID filesystem.  Since GRUB never writes to
the filesystem, this is fine.

  However, the kernel *does* write to the filesystem (of course).  So
you have to tell the kernel to use the RAID device, not the physical
device.  In other words, while your GRUB config file might say "root
(hd0,0)", your kernel command line will say "root=/dev/md0".
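
  A minimal sketch of checking that second half, assuming a
Bourne-style shell.  The helper name root_is_md is hypothetical, and
it assumes root= is given as a device path; a root=LABEL=... or
root=UUID=... setup would be flagged even if it is otherwise fine.

```shell
# Sketch: report whether a kernel command line points root= at an md device.
# root_is_md is a hypothetical helper name, not part of GRUB or mdadm.
root_is_md() {
    # $1: a kernel command line, e.g. the contents of /proc/cmdline
    for arg in $1; do
        case "$arg" in
            root=/dev/md*)
                echo "ok: $arg"
                return 0 ;;
            root=*)
                # Assumption: anything else is treated as a non-RAID root.
                echo "warning: $arg is not an md device"
                return 1 ;;
        esac
    done
    echo "warning: no root= argument found"
    return 1
}

# Usage on a running system:
#   root_is_md "$(cat /proc/cmdline)"
```

  The GRUB side ("root (hd0,0)") is deliberately not checked here,
since GRUB is supposed to name the plain partition.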

  If you tell the kernel to use "/dev/hda1" instead of "/dev/md0", the
kernel will mount the underlying physical device as a filesystem,
while at the same time trying to implement RAID mirroring on top of
the same filesystem.  I'm not sure exactly what happens if you do
that, but I'm sure it won't be good.  I suspect your troubles with
perpetual re-mirroring and odd datestamps stem from this.

>   # mount /dev/sda1 /mnt

  Never ever do that.

  You've just told the kernel to mount a filesystem (read-write) which
is actually a component of a RAID mirror.  So the kernel now sees what
it thinks are two separate filesystems, one via the md device and one
via /dev/sda1, and updates them independently, even though both live
on the same device!  I'm somewhat amazed you haven't trashed the
filesystem completely.

  Never attempt to mount an underlying physical device as a filesystem
in read/write mode.  If you do, the underlying device will be written
to without the other RAID members being updated or the RAID driver
being aware.  At a minimum, your RAID set will be inconsistent.

  Never attempt to mount an underlying physical device if the RAID set
is in read/write mode or resync'ing.  This is because the different
underlying physical devices are not guaranteed to be consistent at any
given time.  The filesystem metadata may not be sane, which will cause
the filesystem driver to puke if it reads something at the wrong time.

  If a RAID device is read-only and consistent, you can mount the
underlying physical device read-only.  This is the only time this is
safe.  Everything is read-only, so nothing is changing, so the
underlying device matches the RAID presentation.
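
  The rules above can be turned into a quick pre-mount check.  This is
only a sketch: is_md_member is a hypothetical helper name, and it only
sees members of arrays that are currently assembled, because that is
all /proc/mdstat lists.  A partition carrying an md superblock that is
not part of a running array will not show up here; "mdadm --examine"
on the partition is the way to look at the superblock itself.

```shell
# Sketch: is this partition a component of a running md array?
# is_md_member is a hypothetical helper name.
is_md_member() {
    # $1: partition name without the /dev/ prefix, e.g. sda1
    # $2: mdstat file to read (defaults to /proc/mdstat)
    # Array lines look like: "md0 : active raid1 sdb1[1] sda1[0]"
    grep -q "^md.* $1\[" "${2:-/proc/mdstat}"
}

# Usage: refuse to mount an active member directly.
#   if is_md_member sda1; then
#       echo "sda1 belongs to an md array; mount the md device instead"
#   fi
```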

RECOVERY

  If you can, just wipe everything and reinstall from scratch.  It'll be easier.

  If you would rather try to salvage the install, boot a rescue CD
that includes RAID support, activate the RAID sets, and see if they
sync.  If so, run an fsck against the RAID device (/dev/md?), not
against any of the underlying partitions.  If that passes, things are
probably consistent.  If not, see above about reinstall.
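
  Roughly, that sequence might look like the following.  Treat it as a
sketch under assumptions: md_busy is a hypothetical helper name,
/dev/md0 stands in for whichever md device you actually have, and the
"-n" (check only, change nothing) on fsck is my cautious choice rather
than anything required.

```shell
# Sketch: returns 0 while the mdstat file shows a resync or rebuild.
# md_busy is a hypothetical helper name.
md_busy() {
    # $1: mdstat file to read (defaults to /proc/mdstat)
    grep -Eq 'resync|recovery' "${1:-/proc/mdstat}"
}

# From the rescue environment, as root (shown as comments, not run here):
#   mdadm --assemble --scan              # activate the RAID sets
#   cat /proc/mdstat                     # see whether they are syncing
#   while md_busy; do sleep 30; done     # wait for the sync to finish
#   fsck -n /dev/md0                     # check the md device, never a member
```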

-- Ben


