RAM Mapping Script
Jim Kuzdrall
gnhlug at intrel.com
Sun Mar 2 15:39:05 EST 2008
On Sunday 02 March 2008 13:28, Ben Scott wrote:
> On Sun, Mar 2, 2008 at 7:53 AM, Jim Kuzdrall <gnhlug at intrel.com>
> wrote:
> > I have a Dell 2650 with one byte of bad RAM.
>
> I take it you mean a Dell Inspiron 2650, and not a Dell PowerEdge
> 2650.
Yup.
> If you've got a DIMM installed in addition to the RAM soldered to
> the main board, try removing the module and running the tests again.
> I've heard of hardware designs doing strange things, like presenting
> add-on memory first when installed. Perhaps that is myth (on x86, at
> least), but worth a shot if you're trying to revive old hardware.
Timely suggestion. I was just removing the last screws - you know,
the ones that you have to break the plastic to get out. Your
suggestion even came before I forgot where all the screws I took out
came from.
Actually, remapping the add-on RAM to the low address makes a lot of
sense from the hardware design aspect. Add-on modules of several sizes
can be used, most of them larger than the on-board memory. With a
large add-on, the hardware must leave a memory hole or do hardware
address translation for all the high order address bits. In contrast,
moving the small memory to the next location above requires only a
simple statically programmed comparator to its enable line.
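A rough model of that argument, as a sketch only. It is not the 2650's actual chipset logic. The 128MB on-board size, the power-of-two sizes, and the names decode and ONBOARD_SIZE are all assumptions made for illustration. The add-on module sits at address 0 and needs no translation; the on-board bank is selected by comparing the high-order address bits against one programmed value.

```python
MB = 1 << 20
ONBOARD_SIZE = 128 * MB  # assumed on-board size; must be a power of two here


def decode(addr, addon_size):
    """Return (bank, offset) for a physical address, add-on mapped low.

    Assumes addon_size is a multiple of ONBOARD_SIZE so that the on-board
    bank starts on a boundary the comparator can match exactly.
    """
    if addon_size % ONBOARD_SIZE:
        raise ValueError("add-on size must be a multiple of the on-board size")

    tag_shift = ONBOARD_SIZE.bit_length() - 1   # bits above the on-board offset
    programmed_tag = addon_size >> tag_shift    # value loaded into the comparator

    if 0 <= addr < addon_size:
        return ("add-on", addr)                 # low addresses pass straight through
    if addr >> tag_shift == programmed_tag:
        # Comparator match drives the enable line; the low bits are the
        # offset, so no subtraction or translation is needed.
        return ("on-board", addr & (ONBOARD_SIZE - 1))
    return ("unmapped", None)


if __name__ == "__main__":
    print(decode(0x64, 256 * MB))
    print(decode(256 * MB + 0x64, 256 * MB))
```

If the hardware does work this way, pulling the module would move the suspect cell's physical address, which is why the removal test is worth running.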
> (The nominally cost-effective thing to do is throw the computer out
> and buy another one, but I trust you are aware of that and have
> particular factors which are affecting the math.)
I let principle override practicality all too often. It irks me to
toss a beautiful piece of engineering because one square micron of
silicon leaks a little too much current for the refresh rate.
>
> > The error occurs at address 64h.
INT 19h stores the code when the numeric keypad is used with the alt
key. Since the numeric keypad is not even enabled on the laptop, it is
hard to guess why that interrupt would be accessed and how it would
affect the functioning of these programs. But...
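The arithmetic linking address 64h to INT 19h can be sketched as follows. It assumes the reported address is a physical address inside the real-mode interrupt vector table, which starts at address 0 and holds one 4-byte entry (offset word, then segment word) per interrupt. The function name vector_for_address is made up for the example.

```python
VECTOR_SIZE = 4  # bytes per real-mode vector: 2-byte offset + 2-byte segment


def vector_for_address(addr):
    """Map a physical address in the vector table to (vector, byte-in-entry)."""
    vector, byte_in_entry = divmod(addr, VECTOR_SIZE)
    if not 0 <= vector <= 0xFF:
        raise ValueError("address lies outside the 256-entry vector table")
    return vector, byte_in_entry


if __name__ == "__main__":
    vector, byte_in_entry = vector_for_address(0x64)
    # 0x64 = 100 decimal, and 100 / 4 = 25 = 19h
    print(f"address {0x64:#x} -> INT {vector:02X}h, byte {byte_in_entry} of the entry")
```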
> I'm not sure you can reserve that region to prevent the system from
> using it (or if the system will work properly if you do). On the
> other hand, once the kernel switches into protected mode, I think
> things get more complicated, so maybe it doesn't matter so much.
Real or unreal mode, the location of the interrupt jump table is
built into the processor, as has been pointed out. But your module
remapping observation may be the real salvation here.
> My understanding is that it is possible to tell the kernel to
> consider certain memory regions as reserved, using the kernel boot
> parameter "memmap=". For you, I think it would be:
>
> memmap=64$1
That is sort of what I was looking for. The mmap() facility lets you
reserve the absolute memory location for a process you assign, but I
doubt you could have an access to that address divert to the process
before the microprocessor fetched the contents and jumped.
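For reference, the kernel documentation gives the reserving form as memmap=nn[KMG]$ss[KMG], with the size first and the start address second, and with plain digits read as decimal. The sketch below only works out which byte range a given string would describe under that reading. It does not touch the kernel, and it does not show whether a region smaller than a page is honoured. The helper names are invented for the example.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse a number with an optional K/M/G suffix; 0x prefix means hex."""
    multiplier = 1
    if text and text[-1].upper() in UNITS:
        multiplier = UNITS[text[-1].upper()]
        text = text[:-1]
    return int(text, 0) * multiplier


def reserved_range(param):
    """Return (start, end) with end exclusive for a 'memmap=nn$ss' string."""
    value = param.split("=", 1)[1]
    size_text, start_text = value.split("$", 1)
    start = parse_size(start_text)
    return start, start + parse_size(size_text)


def covers(param, addr):
    """True if addr falls inside the range the parameter describes."""
    start, end = reserved_range(param)
    return start <= addr < end


if __name__ == "__main__":
    for param in ("memmap=64$1", "memmap=1$0x64"):
        start, end = reserved_range(param)
        print(param, hex(start), hex(end), covers(param, 0x64))
```

Read this way, the string quoted above would describe 64 bytes starting at address 1, which stops short of 64h; a one-byte region at 64h would be written memmap=1$0x64. Some boot loaders may also need the $ escaped.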
> The fact that a test works on one machine and not another isn't
> necessarily conclusive. I've encountered hardware that appears to
> have features incompatible with MemTest86, where some region of
> memory will always be identified as "bad", even though the memory is
> in fact fine. The fact that the fault is reported intermittently is
> a sign that it is an accurate diagnostic, though. Design features
> are usually consistent.
I only intended to show that the program did not report a memory
error at $64 every so often no matter what it was testing. The tests
admittedly do not prove that the Dell address is not good nor that the
t60 address is not actually bad any more than we can be sure that
Microsoft won't recall Vista and substitute Ubuntu tomorrow.
>
> Still, I would suggest obtaining the latest version of the Dell
> Diagnostics for your computer from the Dell website, and running
> that. (I think that generation of laptop will have them as images of
> a multi-diskette boot set.) Sometimes the OEM diagnostics know
> things generic diagnostics don't. At the very least, it would be
> nice if both MemTest86 and Dell Diags identify the same spot as bad.
To see if your RAM removal suggestion works, I should probably use
the same program that reported the error the first time. Remember, the
error occurred only 5 times in 95 hours of testing.
The test program has repeated the test 15 times without an error so
far. It goes faster with 128MB rather than 383MB. I should have
completed 250 tests by tomorrow morning.
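How long a clean run has to be before it means anything can be estimated crudely. The sketch assumes the errors arrive as a Poisson process at the observed rate of 5 in 95 hours. That is a simplification: with less RAM each pass is shorter, so the rate per hour may not carry over. The function names are made up for the example.

```python
import math


def p_no_errors(hours, errors_seen=5, hours_observed=95.0):
    """Chance of a clean run of the given length if the fault is still present."""
    rate = errors_seen / hours_observed      # errors per hour
    return math.exp(-rate * hours)


def hours_for_confidence(confidence, errors_seen=5, hours_observed=95.0):
    """Clean hours needed so a still-present fault would have shown with this probability."""
    rate = errors_seen / hours_observed
    return -math.log(1.0 - confidence) / rate


if __name__ == "__main__":
    print(f"clean 19 h by chance: {p_no_errors(19):.2f}")
    print(f"hours for 95%: {hours_for_confidence(0.95):.1f}")
```

On those assumptions a clean 19 hours would still turn up by chance a bit more than a third of the time, and it takes roughly 57 clean hours to reach 95%.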
Jim Kuzdrall