Backing up a little - Trying to get LAPACK to work...

Bruce Labitt bruce.labitt at myfairpoint.net
Tue May 25 21:03:13 EDT 2010


Umm, my CLAPACK experiment is not doing so well.  (See the "Shot in the 
Dark" thread.)  So I thought I'd try to interface to the "industry 
standard" LAPACK.  In the end, I expect to use CLAPACK, but I thought 
that since MATLAB, GSL, SciPy, et al. use LAPACK, perhaps I could at 
least get some real work(TM) (coding) done.

Fundamentally, the LAPACK results are the same as in CLAPACK, which I 
suppose is good in a way.  I rewrote everything in C using the 
accumulated knowledge I've gained.  Nearly everything is on the heap, 
with mallocs and frees where they belong.  When the 2x2 example is run, 
it works.  Valgrind declared no leaks, no problems.

When the 9x9 example is run, it segfaults.  The program architecture is 
test1.c, svd.c.  test1 is "main".  svd.c is a wrapper function that 
actually calls the FORTRAN subroutine zgesvd_.  The segfault occurs when 
returning from svd.c, not returning from the FORTRAN subroutine.
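One thing worth ruling out with a wrapper like this is an undersized 
array handed to zgesvd_, since a shortfall can go unnoticed at 2x2 and 
only bite at 9x9.  The sketch below is not the code from svd.c; it only 
tabulates the minimum element counts that the ZGESVD documentation 
gives, assuming JOBU='A', JOBVT='A', LDA=M, LDU=M and LDVT=N.  The names 
svd_sizes and svd_min_sizes are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum element counts for the arrays passed to zgesvd_, assuming
 * JOBU='A', JOBVT='A', LDA=M, LDU=M, LDVT=N (assumptions, not taken
 * from the original svd.c). "complex" means one double-complex element,
 * "real" means one double. */
typedef struct {
    size_t a;      /* complex: LDA * N = M * N             */
    size_t s;      /* real:    min(M, N) singular values   */
    size_t u;      /* complex: LDU * M = M * M             */
    size_t vt;     /* complex: LDVT * N = N * N            */
    size_t work;   /* complex: LWORK >= 2*min(M,N)+max(M,N) */
    size_t rwork;  /* real:    5 * min(M, N)               */
} svd_sizes;

/* Hypothetical helper: returns the minimum sizes for an M-by-N input. */
svd_sizes svd_min_sizes(size_t m, size_t n)
{
    assert(m > 0 && n > 0);
    size_t mn = (m < n) ? m : n;   /* min(M, N) */
    size_t mx = (m > n) ? m : n;   /* max(M, N) */

    svd_sizes z;
    z.a     = m * n;
    z.s     = mn;
    z.u     = m * m;
    z.vt    = n * n;
    z.work  = 2 * mn + mx;   /* documented minimum, not the optimum */
    z.rwork = 5 * mn;
    return z;
}
```

For the 9x9 case this gives at least 27 complex elements for WORK and 
45 doubles for RWORK.  The documented minimum for WORK is usually 
smaller than the optimum; calling zgesvd_ once with LWORK=-1 returns the 
recommended size in WORK(1).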

Valgrind reports that the routine does not know where to return to, 
i.e., the return address is 0x?  From what I've been told this is 
indicative of a stack error (overrun).

If instead of calling zgesvd_, I put in a dummy set of operations that 
actually writes to all of the output matrices and then returns from 
svd.c, the program runs with no "error" for the 9x9 case.  I did this 
experiment to see if I was doing something wrong.
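A complementary check to the dummy-write experiment is to pad each heap 
buffer with guard bytes and verify them after the call returns, which 
shows whether the callee wrote past the end of a particular array.  
This is only a sketch of that general technique; guarded_alloc, 
guard_intact, GUARD_BYTES and GUARD_FILL are hypothetical names, and it 
catches only overruns that land in the padding just past the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTES 64     /* padding appended after the usable region */
#define GUARD_FILL  0xA5   /* known pattern written into the padding   */

/* Allocate nbytes of usable space followed by GUARD_BYTES of pattern.
 * Returns NULL on allocation failure; release with free(). */
void *guarded_alloc(size_t nbytes)
{
    unsigned char *p = malloc(nbytes + GUARD_BYTES);
    if (p == NULL)
        return NULL;
    memset(p + nbytes, GUARD_FILL, GUARD_BYTES);
    return p;
}

/* Return 1 if the padding after the first nbytes is untouched, else 0.
 * nbytes must be the same value that was passed to guarded_alloc(). */
int guard_intact(const void *buf, size_t nbytes)
{
    assert(buf != NULL);
    const unsigned char *g = (const unsigned char *)buf + nbytes;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (g[i] != GUARD_FILL)
            return 0;
    }
    return 1;
}
```

The idea would be to allocate S, U, VT, WORK and RWORK this way, call 
the routine, and then test guard_intact() on each buffer before 
returning from the wrapper.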

Next I tried compiling with the -fstack-protector-all switch.  When I 
removed the dummy operations (put back to "normal") and ran the 9x9 
case, zgesvd_ gave results and reported INFO=0, which indicates success.  
The svd.c routine returned to main (test1.c) and printed out an entirely 
optimistic success message ;).  However, on the next instruction, which 
accesses the output arrays, the program segfaulted with a similar 0x? 
error.  In other words the main program can no longer access the arrays 
which it had malloc'ed (and had not yet freed).

If I am interpreting this correctly then it seems there is a stack error 
of some sort in my compiled version of LAPACK.  Or? <smart people fill 
in the blank here, please!>

Does anyone have an idea?
One thing that I can try is to use the "reference" LAPACK in my system 
and link to it.  That way I can hopefully take out the effect of my build.
Any other suggestions?

Jeesh, this was supposed to be 'just' a port...  :-[

-Bruce

