Saturday, June 13, 2009

Generated models

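Models in MathProg/AMPL that are generated by a program, rather than written by hand, tend to have a different structure. In this case the generator produced a scalar model (every variable and constraint written out individually, without indexing) with some very long lines. The user reports very large memory usage: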

Model has been successfully generated
glp_simplex: original LP has 93 rows, 5256 columns, 84143 non-zeros
glp_simplex: presolved LP has 82 rows, 5256 columns, 77163 non-zeros
Scaling...
A: min|aij| =  1.000e+00  max|aij| =  1.600e+02  ratio =  1.600e+02
GM: min|aij| =  4.952e-01  max|aij| =  2.019e+00  ratio =  4.077e+00
EQ: min|aij| =  2.471e-01  max|aij| =  1.000e+00  ratio =  4.048e+00
Crashing...
Size of triangular part = 82
      0: obj =   0.000000000e+00  infeas =  5.657e+00 (0)
*    50: obj =   1.410800000e+00  infeas =  0.000e+00 (0)
*    80: obj =  -7.209526707e-32  infeas =  2.776e-17 (0)
OPTIMAL SOLUTION FOUND
Time used:   0.1 secs
Memory used: 2008.7 Mb (2106232859 bytes)

(Some versions may even stop with a stack overflow.) This is due to the modeling language part of glpsol and not to the solver itself. When we run:

$ glpsol --math big.mod --wcpxlp big.lp
$ glpsol --cpxlp big.lp

the second invocation, which only reads the LP file and solves it, uses just a fraction of the memory. Clearly GLPK has trouble with very long scalar expressions.

AMPL does not do much better on the same model:

big.mod, line 7005 (offset 158374):
        HES03 is not defined

It looks like we hit a maximum line length limit: a variable name gets truncated, which leads to this error. A better message would be “line is too long”. Even better would be to have no artificial limits on input lines at all.
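On the generator side, one way to stay clear of such limits is to wrap long expressions before writing them out. Below is a minimal Python sketch, assuming the generator holds the terms of an expression as strings. The helper name wrap_expression and the 72-character width are made up for illustration and are not part of GLPK or AMPL. MathProg and AMPL are free-format, so a line break between tokens should not change the model. Whether shorter lines also reduce the memory glpsol needs is not something the log above tells us.

```python
def wrap_expression(terms, width=72, indent="    "):
    """Join terms with ' + ' and break lines so that no line exceeds `width`.

    Hypothetical helper for a model generator. A single term that is
    longer than `width` still ends up on a line of its own, unbroken.
    """
    lines = []
    current = ""
    for i, term in enumerate(terms):
        # Every term except the first carries its leading '+' with it,
        # so a line break never separates an operator from its operand.
        piece = term if i == 0 else "+ " + term
        if not current:
            current = indent + piece
        elif len(current) + 1 + len(piece) <= width:
            current += " " + piece
        else:
            # Appending would exceed the width: flush and start a new line.
            lines.append(current)
            current = indent + piece
    if current:
        lines.append(current)
    return lines


# Example: write one long scalar constraint over several short lines.
# The variable names x1..x40 and the constraint name c1 are illustrative.
terms = [f"{i} * x{i}" for i in range(1, 41)]
print("s.t. c1:")
print("\n".join(wrap_expression(terms)))
print("    <= 10;")
```

The sketch only handles sums of terms with positive coefficients. A real generator would also need to deal with signs and with the matching var declarations.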

I remember reports about similar issues with early versions of GCC: the compiler had problems with machine-generated C code in which single expressions spanned several pages. This was very different from how a normal programmer would write code.