## Monday, September 4, 2023

### Critiquing a GAMS Model

It is always interesting to read GAMS models written by someone else. There are probably three things one can observe:

• A nice formulation or concept that is useful to learn about.
• A bad implementation: something that really should not be done that way.
• A piece of code that is correct and defensible, but I would write it differently. This includes things like style, layout, formatting, etc.
My way of reading GAMS code is often to start editing and make it "my code". It is a bit slower process, but that comes with its advantages: better understanding of what is going on, and often cleaner code.

Here, I am looking at the model sambal.gms in the GAMS model library . It is a very small model, but I have many thoughts about it. The complete model is reproduced in Appendix 1. Let's walk through it.

The matrix balancing problem is to find a nearby matrix such that row- and column sums are obeyed. A relative quadratic objective is used to minimize the sum of the squared deviations between the original data (the priors) and the final matrix. Zeros in the matrix need to be maintained: they can't become nonzero. This is sometimes called sparsity preservation. Often, sign-preservation is another condition. That is not part of this model. Note that, in this model, not only the matrix is updated but also the row and column totals.

### Data

The data entry and data prep step in sambal.gms is:

 Set i 'accounts' / lab, h1, h2, p1, p2 /; Alias (i,j); Table xb(i,j) 'original estimate'         lab  h1   h2   p1  p2   lab        15    3  130  80   h1     na   h2     na   p1         15  130       20   p2         25   40   55    ; Parameter   tb(i)   'original totals'   / lab 220, (h1,h2) na, p1 190, p2 105 /   tw(i)   'wights for totals'   xw(i,j) 'weights for cells'; tw(i)   = 1;xw(i,j) = 1$xb(i,j);tw(i)$(mapval(tb(i))     = mapval(na)) = 0;xw(i,j)$(mapval(xb(i,j)) = mapval(na)) = 0; display tw, xw; The first thing to note is that the table has some special values: NA. I usually use capitals for this to make it stand out more. There are also some blank cells that are zero. In GAMS, sparse storage is used throughout. So the important rule to remember is "zero is the same as 'does not exist'". A cell with NA does exist (i.e., it is nonzero), but it has a special value. I assume xb means "x before". The zero values in the matrix have a special meaning. They need to be preserved, i.e., the solution values for these should also be zero. The table also has a rim with totals. In this model, they are specified separately. I would combine them into one. I.e.: Table xb(*,*) 'original estimate' lab h1 h2 p1 p2 total lab 15 3 130 80 220 h1 na na h2 na na p1 15 130 20 190 p2 25 40 55 105 total 220 na na 190 105 ; This conveys better the underlying problem: our matrix needs to obey some row- and column-sum constraints. It is always better to use domain checking (which is turned off by using *), so I would suggest: Set iext 'extended:accounts+totals' / lab, h1, h2, p1, p2, total / i(iext) 'accounts' / lab, h1, h2, p1, p2 / ; Table xb(iext,iext) 'original estimate' lab h1 h2 p1 p2 total lab 15 3 130 80 220 h1 na na h2 na na p1 15 130 20 190 p2 25 40 55 105 total 220 na na 190 105 ; The blanks in the table (zeros) have a special meaning: zeros have to be preserved. We can fix the corresponding variables to zero, or, as is done in this model, we can skip these empty cells. Opposed to this, the NA values are to be included in the add-up constraints but skipped in the objective as we can't calculate a deviation from this. The typo in the explanatory text indicates we really need an IDE with spell-checking. My GAMS code often contains many comments, as I need to pass on code to other people who are not necessarily GAMS experts or even very familiar with the underlying models. So, I tend to be more verbose. Spell-checking can help there. I don't agree with how the assignments to tw and xw are written. Let me explain why. The mapval(x) function is an exotic function that gives back a numerical value corresponding to the type of x: 0: is not a special value 4: is UNDF (undefined) 5: is NA (not available) 6: is INF ( ) 7: is -INF ( ) 8: is EPS >8: is an acronym I don't think I have ever used this function. There is almost never a good reason to use it in this way, as we can say: tw(i)$(tb(i)=na) = 0;