raid1 woes

Lloyd Kvam lkvam at venix.com
Tue Aug 29 00:30:29 EDT 2017


The problem server is getting used for testing and was working well enough that I
kept it available for use until dinner time.

This evening I rebooted using a Fedora 24 (yes, I need to upgrade) USB flash drive.
The mdadm commands all worked normally. The recovery (resync) operations ran through
to completion. The drives and server appear to be back to normal. It is back in
service.

So I hope nobody lost sleep worrying about this. I suppose the lesson for me is that
booting off alternate media makes it easy to work on the drives, and that this
applies to RAID as well.
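
For anyone who hits a similar stall: one way to confirm that a recovery has really
stopped, rather than just slowed down, is to compare the block counter in
/proc/mdstat across two samples. The following is a minimal Python sketch, not
something that was run on this server. The helper names parse_progress and
is_stalled are made up for illustration, and the regular expression assumes the
progress line looks like the one in the quoted message below.

```python
import re

# Matches the progress counters in a /proc/mdstat line such as
#   "recovery = 30.2% (9924224/32766904) finish=5241.7min speed=72K/sec"
PROGRESS_RE = re.compile(r"(recovery|resync)\s*=\s*([\d.]+)%\s*\((\d+)/(\d+)\)")


def parse_progress(mdstat_text, device):
    """Return (blocks_done, blocks_total) for `device` (e.g. "md1").

    Returns None if the device is not listed or no resync/recovery
    progress line is present in its stanza.
    """
    in_device = False
    for line in mdstat_text.splitlines():
        if line.startswith(device + " :"):
            in_device = True
            continue
        if in_device:
            # A blank or unindented line ends this device's stanza.
            if not line.strip() or not line[0].isspace():
                break
            match = PROGRESS_RE.search(line)
            if match:
                return int(match.group(3)), int(match.group(4))
    return None


def is_stalled(first, second):
    """True when both samples show progress counters and the second has not advanced."""
    return first is not None and second is not None and second[0] <= first[0]
```

Reading /proc/mdstat twice, a minute or so apart, and passing both results to
is_stalled gives a yes/no answer. How long to wait between samples is a judgment
call; at the 72K/sec shown in the quoted output, even a healthy recovery advances
slowly.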

On Mon, 2017-08-28 at 14:44 -0400, Lloyd Kvam wrote:
> I have a raid1 where the resync operation stalls. It is set up to mirror three
> drives with a fourth as a spare. This setup has been working nicely for several
> years, until yesterday. I rotate a drive to off-site storage to protect the data,
> relying on the raid to maintain 3 live disks.
> 
> From /proc/mdstat
> md1 : active raid1 sdc2[6](S) sda2[5] sdb2[4] sdd2[3](F)
>       32766904 blocks super 1.2 [3/2] [UU_]
>       [======>..............]  recovery = 30.2% (9924224/32766904) finish=5241.7min speed=72K/sec
> 
> This looks normal, but it will NOT advance past that block. It shows two live disks
> in the array and a resync to bring the 3rd drive current. Unfortunately that 3rd
> drive will not finish the recovery.
> 
> This happened yesterday to the original spare when I went to rotate a drive out for
> off-site storage. The original third drive also stalled when added back to the
> array. Today I tried to add a new drive, but am hitting the same problem.
> 
> My best guess is that one or both of the live drives is failing. What should I do
> next? My plan is to use ddrescue to copy the two live disks to new drives and then
> try to start the raid array with the new drives. I am hoping that will allow me to
> resync a third drive. Does this make sense?
> 
> Is there a better approach?
> 
> https://raid.wiki.kernel.org/index.php/Linux_Raid
> This seems to be (by far) the best site for background, but they do not address a
> stalled recovery in any page that I have found.
> 
> Thanks for any advice or suggestions.
> 
> 
-- 
Lloyd Kvam
5 Foliage View
Lebanon, NH 03766
802-448-0836

More information about the gnhlug-discuss mailing list