Laptop Saved! (was RAM Mapping Script)

Jim Kuzdrall gnhlug at intrel.com
Thu Mar 6 09:31:15 EST 2008


On Wednesday 05 March 2008 21:38, Bill McGonigle wrote:
> Nice detective work, Jim!
>
> On Mar 5, 2008, at 16:28, Jim Kuzdrall wrote:
> >    The drive tested the same at 20C with either the -c or -c -c
> > options.
>
> Did you happen to interrogate the S.M.A.R.T. status during this
> process?  Usually when a drive fails -c, there's also a S.M.A.R.T.
> error code.

    Tried.  Drive was too old to support SMART.
>
> >     So it appeared to be a hangup in the drive's internal
> > controller when it has a long series of read/writes to do.  The
> > "locate" database search may have triggered it in situ.
>
> You can probably convince the kernel to throw smaller command queues
> at it.  I've had to do this for a Dell PERC 3di RAID controller and a
> few bad NCQ implementations on SATA II drives, esp. under Linux 2.6
> which appears to be more efficient than 2.4 (or Windows) in this
> regard, so some controller firmwares get confused (race conditions,
> small buffers, I dunno).
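
    A minimal sketch of the smaller-queue idea, in Python.  It assumes
a Linux system that exposes the per-device setting at
/sys/block/<dev>/device/queue_depth, which many SCSI/libata devices do
but which is not guaranteed for this drive or kernel.  The function
name and the sysfs_root parameter are illustrative only, and writing
the real file needs root.

```python
from pathlib import Path


def set_queue_depth(device: str, depth: int, sysfs_root: str = "/sys") -> int:
    """Write a new queue depth for a block device; return the previous value.

    Assumption: the kernel exposes
    <sysfs_root>/block/<device>/device/queue_depth for this device.
    """
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    path = Path(sysfs_root) / "block" / device / "device" / "queue_depth"
    previous = int(path.read_text().strip())  # remember the old setting
    path.write_text(f"{depth}\n")             # ask the kernel for a shorter queue
    return previous


# Hypothetical usage (device name is an example, run as root):
#     old = set_queue_depth("sda", 1)
```

    The setting does not survive a reboot, so the returned value is
only useful for restoring the old depth within the same session.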

    Yes, I have had similar things happen in my designs.  The incoming
data rate starts to overrun (wrap) the receive buffer.  The receiver
notifies the sender, but data is lost before the sender responds.
Usually, the sender has already filled a buffer that is completing its
transmission under hardware control.  Many of the classic "RS-232"
controllers responded to a change on the Clear-to-Send (CTS) line
within two bytes at most, but the ATA interface may have lost that
feature.
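
    The overrun described above can be sketched as a small simulation.
It is illustrative only: the buffer size, high-water mark, drain rate
and sender latency are made-up parameters, not measurements from the
drive or from any real controller, and the function name is
hypothetical.

```python
def simulate_overrun(buffer_size, high_water, sender_latency, total_bytes,
                     drain_every=4):
    """Count bytes lost when the sender reacts late to a 'stop' signal.

    The sender emits one byte per tick.  The receiver removes one byte
    every `drain_every` ticks.  When the fill level reaches `high_water`
    the receiver raises 'stop', but the sender still emits
    `sender_latency` more bytes (already committed to its hardware
    buffer) before pausing.  It resumes once the buffer is half drained.
    """
    if not 1 <= high_water <= buffer_size:
        raise ValueError("need 1 <= high_water <= buffer_size")
    fill = 0          # bytes currently held in the receive buffer
    lost = 0          # bytes dropped because the buffer was full
    sent = 0          # bytes the sender has emitted so far
    committed = None  # bytes still to be emitted after 'stop'; None = running
    tick = 0
    while sent < total_bytes:
        tick += 1
        if tick % drain_every == 0 and fill > 0:
            fill -= 1                      # receiver consumes a byte
        if committed is None and fill >= high_water:
            committed = sender_latency     # 'stop' raised; sender lags behind
        if committed is not None and fill <= high_water // 2:
            committed = None               # 'stop' released; sender resumes
        if committed == 0:
            continue                       # sender is paused this tick
        if committed is not None:
            committed -= 1
        sent += 1
        if fill < buffer_size:
            fill += 1
        else:
            lost += 1                      # buffer full: the byte is dropped
    return lost


quick = simulate_overrun(16, 14, sender_latency=2, total_bytes=200)
slow = simulate_overrun(16, 14, sender_latency=8, total_bytes=200)
print("lost with 2-byte response:", quick)
print("lost with 8-byte response:", slow)
```

    With two bytes of headroom above the high-water mark, a sender
that stops within two bytes loses nothing, while one that runs on for
eight bytes overruns the buffer.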
>
> > I doubt I will ever know and no
> > longer care.
>
> Aww, c'mon, push it over 60 hours, and go for the 'really
> environmental' status. :) 

    Actually, I do have some curiosity left.  After disassembling the 
drive, I want to see if any common household chemicals can etch off the 
platter's magnetic coating.

Jim Kuzdrall

