Wednesday, March 14, 2012

GAMS GDX performance

Reading and writing GDX files is very fast in GAMS. However, I ran into a relative slowdown in a few cases, all involving huge GDX files (around 1.7 GB). I could reproduce the behavior with a small example.

Model A
set i /i1*i400/;
alias (i,j,k);
parameter p(i,j,k);
p(i,j,k) = uniform(0,1);
execute_unload "1.gdx";
This model generates a large test file. The parameter p has 64 million entries (400 x 400 x 400), and the file size is 0.6 GB.
Model B
set i /i1/;
parameter p(i,*,*);
$gdxin 1.gdx
$load p
This model reads a slice of the data. It takes 13 seconds, which is longer than I expected. Note that I usually recommend using $loaddc, but here the use of $load is intentional.
Model C
set i /i1*i400/;
parameter p(i,*,*);
$gdxin 1.gdx
$load p
This model reads the whole thing in 6 seconds. Hence Model B should take less than 6 seconds.
Model D
parameter p(*,*,*);
$gdxin 1.gdx
$load p
This is almost the same as Model C but needs 11 seconds. It should run as fast as Model C.

I expected Models C and D to perform the same and Model B to be significantly faster. For Model B, the GDX routines will still do what is sometimes called a complete table scan, but as we can skip a lot of records, this should still be faster than loading the whole thing.
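
Model B relies on $load silently skipping records that fall outside the declared domain. The following is a minimal sketch of that filtering on a tiny file, so it can be tried without generating the 0.6 GB test file. The names small.gdx, i0 and q are hypothetical and used only for illustration.

```gams
* Write a tiny GDX file at compile time (small.gdx, i0 and q are hypothetical names)
set i0 /a,b,c/;
parameter q(i0) /a 1, b 2, c 3/;
$gdxout small.gdx
$unload q
$gdxout

* Read only the slice for element a: p is declared over the smaller set i
set i /a/;
parameter p(i);
$gdxin small.gdx
* $load drops the records for b and c without complaint
$load p=q
$gdxin
display p;
```

Replacing $load with $loaddc in this sketch should make the compiler flag the records for b and c as domain violations instead of skipping them, which is why $load is the intentional choice when only a slice is wanted.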

The times reported above are total times. In some cases part of that time is not accounted for in the profile. For example, if we run Model B with the option PROFILE=1, we see:

   1  set i /i1/;
   2  parameter p(i,*,*);
GDXIN   C:\projects\tmp\load\1.gdx
--- LOAD  p = 4:p
----      4 $load                    6.724     6.724 SECS      8 Mb

COMPILATION TIME     =       13.588 SECONDS      8 Mb  WEX237-237 Aug 23, 2011

I suspect the $load timing is not correct: the profile reports 6.7 seconds for $load while the compilation takes 13.6 seconds, and we don’t do anything else in the model.

Update: the cause of the slow $load performance of Model B has been fixed in the GDXIO library. I assume this fix is too late for 23.8.