Tuesday, November 4, 2014

Process GAMS code

For a documentation tool, I investigated how we can use compiled GAMS code to understand data manipulation statements and model equations. I have seen attempts in the past where GAMS source was parsed directly. This is a very difficult task: the GAMS language has many esoteric extensions, and maintaining full compatibility is almost impossible. Indeed, in the examples I looked at, even small standard models from the model library were rejected.

So instead of parsing GAMS models directly, I work with the output of the GAMS compiler. This is a byte-code type of output that is not well suited for further processing (other than execution). So we take this linear code and make a tree out of it. Walking the tree and generating all kinds of interesting output is then relatively straightforward. Here we produce LaTeX output. E.g.

c(i) = sum(j, f(j)*d(i,j))/100;

causes the GAMS compiler to generate:

INSTRUCTION DUMP FROM 1 TO 401

  LOC    NUM INSTRUCT   SUB        FLD IDENT

  114      0 UnitBeg       0         17
  115     14 DefBeg        0        137 c
  116      7 CntrBeg       0        133 i
  117     12 Index         0          1
  118     55 EndLhs        0          0
  119     29 SumBeg        0          0
  120      7 CntrBeg       0        134 j
  121     12 Index         0          2
  122      2 Push          0        139 f
  123     12 Index         0          1
  124     12 Index         0          2
  125      2 Push          0        138 d
  126      4 MultOp        0          0
  127     30 SumOp         0          0
  128      8 CntrEnd       0          0
  129     21 Immed         0          5   1.00000000000000E+002
  130      4 MultOp        1          0
  131     15 DefOp         0        137 c
  132      8 CntrEnd       0          1
  133     16 DefEnd        0        137 c
  134      1 UnitEnd       0          0
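
Before a listing like this can be turned into a tree, it has to be read into records. The following is a minimal Python sketch of that step, not the code used in the tool. It assumes the dump is available as plain text in exactly the column layout shown above; the names `Instr`, `LINE` and `parse_dump` are made up for this illustration.

```python
import re
from typing import List, NamedTuple, Optional


class Instr(NamedTuple):
    """One row of the instruction dump (field names follow the column headers)."""
    loc: int
    num: int
    op: str
    sub: int
    fld: int
    ident: Optional[str]  # symbol name, or the literal value for Immed


# Assumed layout: LOC NUM INSTRUCT SUB FLD [IDENT]; anything else is skipped.
LINE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\w+)\s+(-?\d+)\s+(-?\d+)(?:\s+(\S+))?\s*$")


def parse_dump(text: str) -> List[Instr]:
    instrs = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # title line, column header or blank line
        loc, num, op, sub, fld, ident = m.groups()
        instrs.append(Instr(int(loc), int(num), op, int(sub), int(fld), ident))
    return instrs
```

Rows that do not match the pattern (the title, the column header, blank lines) are silently dropped, which is convenient here but would hide a changed dump format.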

A tree version of this can look like:

image
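
To give an idea of how the linear code can be folded into a tree, here is a small Python sketch that handles only the instructions appearing in the dump above. The meaning attached to each instruction is my reading of this one listing and should be treated as an assumption: `Index k` is taken to refer to the k-th set currently under control, `Push` to consume the indices collected so far, and `MultOp` with `SUB` equal to 1 to mean division (because the source divides by 100). The tuple-based node format and the name `build_tree` are invented for this example.

```python
def build_tree(instrs):
    """Fold (op, sub, fld, ident) records into a nested-tuple expression tree."""
    controls = []    # sets currently under control, e.g. ["i", "j"]
    pending = []     # indices collected for the next symbol reference
    stack = []       # operand stack holding finished sub-trees
    sum_marks = []   # number of controlled sets when each sum was opened
    target = lhs = result = None

    for op, sub, fld, ident in instrs:
        if op == "DefBeg":
            target = ident
        elif op == "CntrBeg":
            controls.append(ident)
        elif op == "CntrEnd":
            controls.pop()
        elif op == "Index":
            pending.append(controls[fld - 1])   # assumed: 1-based control position
        elif op == "EndLhs":
            lhs = ("ref", target, tuple(pending))
            pending = []
        elif op == "Push":
            stack.append(("ref", ident, tuple(pending)))
            pending = []
        elif op == "Immed":
            stack.append(("const", float(ident)))
        elif op == "MultOp":
            right = stack.pop()
            left = stack.pop()
            stack.append(("/" if sub == 1 else "*", left, right))  # assumed: SUB 1 = divide
        elif op == "SumBeg":
            sum_marks.append(len(controls))
        elif op == "SumOp":
            body = stack.pop()
            sets = tuple(controls[sum_marks.pop():])  # sets opened since SumBeg
            stack.append(("sum", sets, body))
        elif op == "DefOp":
            result = (":=", lhs, stack.pop())
        elif op in ("UnitBeg", "UnitEnd", "DefEnd"):
            pass  # bracketing instructions carry no expression content here
        else:
            raise ValueError("instruction not covered by this sketch: " + op)
    return result


# The dump above as (INSTRUCT, SUB, FLD, IDENT) records.
PROGRAM = [
    ("UnitBeg", 0, 17, None), ("DefBeg", 0, 137, "c"), ("CntrBeg", 0, 133, "i"),
    ("Index", 0, 1, None), ("EndLhs", 0, 0, None), ("SumBeg", 0, 0, None),
    ("CntrBeg", 0, 134, "j"), ("Index", 0, 2, None), ("Push", 0, 139, "f"),
    ("Index", 0, 1, None), ("Index", 0, 2, None), ("Push", 0, 138, "d"),
    ("MultOp", 0, 0, None), ("SumOp", 0, 0, None), ("CntrEnd", 0, 0, None),
    ("Immed", 0, 5, "1.00000000000000E+002"), ("MultOp", 1, 0, None),
    ("DefOp", 0, 137, "c"), ("CntrEnd", 0, 1, None), ("DefEnd", 0, 137, "c"),
    ("UnitEnd", 0, 0, None),
]

tree = build_tree(PROGRAM)
```

The approach is the usual one for stack code: operands are pushed as leaves, and each operator pops its arguments and pushes a new interior node, so the last node left over is the root of the statement.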

Finally some output:

image
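
Generating LaTeX is then a recursive walk over the tree. The sketch below illustrates the idea on a nested-tuple tree for the assignment above; the node format (`"ref"`, `"const"`, `"*"`, `"/"`, `"sum"`, `":="`) and the function name `to_latex` are assumptions made for this example, and the typesetting choices (subscripts for indices, `\frac` for division) are just one possibility. Operator precedence and parentheses are deliberately ignored.

```python
def to_latex(node):
    """Render a nested-tuple expression tree as a LaTeX math string."""
    kind = node[0]
    if kind == "ref":                       # symbol with optional indices
        _, name, idx = node
        return name + "_{" + ",".join(idx) + "}" if idx else name
    if kind == "const":                     # numeric literal, shortest form
        return format(node[1], "g")
    if kind == "*":
        return to_latex(node[1]) + " \\cdot " + to_latex(node[2])
    if kind == "/":
        return "\\frac{" + to_latex(node[1]) + "}{" + to_latex(node[2]) + "}"
    if kind == "sum":                       # sum over one or more sets
        return "\\sum_{" + ",".join(node[1]) + "} " + to_latex(node[2])
    if kind == ":=":                        # assignment statement
        return to_latex(node[1]) + " := " + to_latex(node[2])
    raise ValueError("node kind not covered by this sketch: " + str(kind))


# Tree for: c(i) = sum(j, f(j)*d(i,j))/100;
TREE = (":=",
        ("ref", "c", ("i",)),
        ("/",
         ("sum", ("j",),
          ("*", ("ref", "f", ("j",)), ("ref", "d", ("i", "j")))),
         ("const", 100.0)))

print(to_latex(TREE))
# prints: c_{i} := \frac{\sum_{j} f_{j} \cdot d_{i,j}}{100}
```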

A more elaborate equation:

armington(i)..
  x(i) =e= ac(i)*(delta(i)*m(i)**(-rho(i)) + (1-delta(i))*xxd(i)**(-rho(i)))**(-1/rho(i)) ;

This gives a large tree:

image

If we also use colors to indicate whether a symbol is a variable (red) or a parameter (blue), we can generate:

image
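
For reference, colored output of this kind could be written in LaTeX roughly as follows. This is a hand-written sketch, not the generated code: the macros `\var` and `\param` are invented here, and it assumes that x, m and xxd are the variables while ac, delta and rho are parameters.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{xcolor}
% Hypothetical helper macros: variables in red, parameters in blue.
\newcommand{\var}[1]{\textcolor{red}{#1}}
\newcommand{\param}[1]{\textcolor{blue}{#1}}
\begin{document}
\[
  \var{x}_i = \param{ac}_i
  \left( \param{\delta}_i \, \var{m}_i^{-\param{\rho}_i}
       + (1 - \param{\delta}_i) \, \var{xxd}_i^{-\param{\rho}_i}
  \right)^{-1/\param{\rho}_i}
\]
\end{document}
```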