Thursday, June 25, 2009

gsl vs lapack performance

I had some doubts about the LU routines in the gsl library (GNU Scientific Library). See Here I try a quick experiment by inverting a square nxn matrix. As test matrix I used the Pei matrix ( Here are the results:

library gsl lapack
routine gsl_linalg_LU_decomp +
compiler cygwin gcc Lahey lf95
n=100 0.047 seconds 0.024 seconds
n=1000 6.735 seconds 2.017 seconds
n=2000 1:01 minutes 17.257 seconds

Indeed it looks that gsl is not quite competitive with lapack for this operation (although much better than figures I displayed earlier – these were wrong). I suspect this is due to how the gsl routines access many of the matrix elements: through an inlined function with the complicated expression m->data[i * m->tda + j]. I think the fast C way of doing this is to use pointers and pointer arithmetic. Another issue is cache misses. One should run through the matrix so that memory accesses as as much localized as possible.

PS. Of course it is noted that in most cases you do not want to invert a matrix, but rather solve for a given rhs.

PS2. Sorry: previous lapack timings were wrong (the input matrix was by accident zero for off-diagonal elements, that makes the inversion very simple!).