Yet Another Math Programming Consultant: gsl vs lapack performance

Thursday, June 25, 2009

gsl vs lapack performance

I had some doubts about the LU routines in the gsl library (GNU Scientific Library). See http://yetanothermathprogrammingconsultant.blogspot.com/2009/06/gsl-gnu-scientific-library.html. Here I try a quick experiment by inverting a square nxn matrix. As test matrix I used the Pei matrix (http://portal.acm.org/citation.cfm?id=368975). Here are the results:

library	gsl	lapack
routine	gsl_linalg_LU_decomp + gsl_linalg_LU_invert	dgesv
compiler	cygwin gcc	Lahey lf95
n=100	0.047 seconds	0.024 seconds
n=1000	6.735 seconds	2.017 seconds
n=2000	1:01 minutes	17.257 seconds

Indeed it looks that gsl is not quite competitive with lapack for this operation (although much better than figures I displayed earlier – these were wrong). I suspect this is due to how the gsl routines access many of the matrix elements: through an inlined function with the complicated expression m->data[i * m->tda + j]. I think the fast C way of doing this is to use pointers and pointer arithmetic. Another issue is cache misses. One should run through the matrix so that memory accesses as as much localized as possible.

PS. Of course it is noted that in most cases you do not want to invert a matrix, but rather solve for a given rhs.

PS2. Sorry: previous lapack timings were wrong (the input matrix was by accident zero for off-diagonal elements, that makes the inversion very simple!).

3 comments:

AnonymousJune 25, 2009 at 10:25 AM
The expression m->data[i * m->tda + j]

does not look like a function call. It just seems to be
an array indexing operation. Why is this so much slower than the method lapack uses? What does lapack use anyway?
ReplyDelete
Replies
Erwin KalvelagenJune 25, 2009 at 11:32 AM
The inline function is gsl_matrix_get(). Lapack uses normal array access: a(i,j).
ReplyDelete
Replies
PeterJune 30, 2009 at 9:37 PM
If the function is truly inlined by the compiler (ie- it doesn't generate a new function call on the stack), then the code you quote IS just pointer arithmetic. The [] operator is syntactic sugar for pointer arithmetic.
Thanks for the posts, I wasn't aware of GSL. Looks like they are not using LAPACK but BLAS 1 and 2 instead, which might explain the perf difference you are seeing? http://www.gnu.org/software/gsl/manual/html_node/Linear-Algebra.html
ReplyDelete
Replies

Add comment