disk scrubber for Linux?

Fri Dec 9 09:27:01 EST 2005

Ben Scott <dragonhawk at gmail.com> writes:

>   It certainly seems like it *could* be worthwhile, and I doubt it
> would hurt.  I do think John Abreau's point, that such files should
> get read anyway, and regularly, as part of backups, is a good one. 
> But I also envision situations that don't work that way -- in
> particular, quiescent data which is backed up once, and then excluded
> from regular backups.

Or, how about a hard-drive based archival system designed to *replace
and obviate* tape backups and slow WORM media?  A previous employer of
mine sells exactly this type of solution, built on top of Linux, using
commodity hard drives.

Despite the fact that the system is designed to replicate each piece
of data at least once and deposit it elsewhere on the system (where by
system, I mean a cluster of Linux-based servers), individual
hard-drive failures are extremely problematic in that if a single
drive fails, all the data on that drive now needs to be replicated
throughout the system again to ensure data integrity.

Should another drive subsequently fail during this replication,
there's a serious problem for the system, since the second failed
drive was originally a destination for data contained on the first
failed drive.  This system is (supposedly) designed (according to
marketing) to not require being backed up.  Most of the failures seen
affect only very small portions of the system, usually one or two
blocks which "go bad" at some point but are mapped to infrequently
read files.

While it's all well and good to claim that backups are still
necessary, or the system should be designed better, yadda, yadda,
yadda, the reality is this:

 - the quality of IDE hard drives sucks compared to SCSI
 - the cost of SCSI quality is more than customers will pay
 - there is big demand for low-cost, disk-based archival systems
   scaling into the petabyte range

Given that the servers in these clusters are usually 1U, and have 4
drives in them, it's imperative to maximize the storage space per
node.  Going to SCSI automatically increases your cost ridiculously,
but more importantly, simultaneously decreases your maximum capacity
per node.

So, mod's request for something which "scrubs" disks in the background
and re-maps bad blocks on the fly, IMO, is not only a legitimate
request, but something sorely needed in the Linux space.
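
To make the idea concrete, here is a rough sketch of the read-only half
of such a scrubber: walk a file or block device chunk by chunk and note
the offsets where a read fails.  The function name, chunk size, and
return shape are my own assumptions, not anything from an existing
tool, and it only *detects* bad regions; the remapping itself would
have to happen in the drive or a lower layer.

```python
import os

def scrub(path, chunk_size=1 << 20):
    """Read every chunk of `path` once.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the
    starting offset of each chunk whose read raised an I/O error.
    Hypothetical helper for illustration only.
    """
    bad_offsets = []
    bytes_read = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file; on Linux
        # it also works for block devices.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, length, offset)
                bytes_read += len(data)
            except OSError:
                # Record the failing chunk and keep going, so one bad
                # block does not stop the rest of the pass.
                bad_offsets.append(offset)
            offset += length
    finally:
        os.close(fd)
    return bytes_read, bad_offsets
```

Reads like these can be satisfied from the page cache without touching
the platters, so a real scrubber would need to bypass the cache (for
example by opening with O_DIRECT) and throttle itself to stay in the
background.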
-- 

Seeya,
Paul