## Friday, June 25, 2010

### GAMS augmenting a set

Ina project we encountered the problem where we loaded a GDX file and subsequently wanted to augment one of the sets in that GDX file with some elements. The following trick will do this:

set i(*);
\$gdxin data.gdx
* i has now elements a,b

\$onmulti
set i /c/;
\$offmulti
* i has now elements a,b,c

parameter p(i) /

a 1

b 2

c 3
/;
display i,p;

In the GDX file the set i has elements a and b. In the GAMS code we add the element c. This is not very intuitive, but it works. The advantage of this method is that we do this at compile time, i.e. we can declare parameters and variables over the set i. Note that we cannot use DISPLAY to debug this:

set i(*);
\$gdxin data.gdx

display i;

\$onmulti
set i /c/;
\$offmulti

display i;

This code will show the same output for both display statements:

 ----      5 SET i  a,    b,    c ----     11 SET i  a,    b,    c

This is because the display statement is executed at execution time while the sets are formed at compile time.

## Monday, June 21, 2010

http://eusprig.org/ has some interesting info on spreadsheet usage. Especially: http://eusprig.org/horror-stories.htm

## Friday, June 18, 2010

### NA in GAMS

Q: Does the value of A make sense; why is NA in there? There is no NA in B

set i /i1*i10/;
parameter A(i) /i1=1, i2=NA, i3=1e-10/;
display A;
set tiny(i);

tiny(i)\$[  A(i)
and
(Abs(A(i)) < 1e-5)
and
(A(i) <>
NA)
] =
yes;
display tiny;

parameter B(i);
B(i)\$(A(i) <>
NA) = A(i);
display B;

The purpose is to find the small values and store their index in set tiny. The result is:

 ----      5 PARAMETER A  i1       1.000,    i2          NA,    i3 1.00000E-10 ----     12 SET tiny  i2,    i3 ----     16 PARAMETER B  i1       1.000,    i3 1.00000E-10

We see element i2 is present, although we included A(i)<>NA as condition. This is not completely intuitive, but for element i2, with A(‘i2’)=NA, the condition reads as:

tiny(‘i2’)\$(NA and NA and NO) =  yes;

Now the rule is that any operation on NA (including ‘and’) results in NA. So the whole condition becomes:

NA and NA and NO => NA

As \$NA evaluates to \$1, we actually get that element tiny(i) set to YES.

Working with special values requires special attention, especially in complex expressions. Usually it is best to split this in parts:

set tiny2(i);
tiny2(i)\$[A(i)
and (Abs(A(i)) < 1e-5)] = yes;
tiny2(i)\$(A(i)=
NA) = no;
display tiny2;

Now this set will only contain the element i3.

## Thursday, June 17, 2010

### sparse matmul revisited

In http://yetanothermathprogrammingconsultant.blogspot.com/2009/02/timing-of-sparse-matrix-multiplication.html we time a (sparse and dense) matrix multiplication of two matrices (parameters). In OPL this can look like:
 int N = 250; {int} I = asSet(1..N); float A[i in I][j in I] = (i==j)?1:0; float B[i in I][j in I] = (i==j)?1:0; //float A[i in I][j in I] = 1; //float B[i in I][j in I] = 1; float C[i in I][j in I] = sum(k in I) A[i,k]*B[k,j]; float s = sum(i in I, j in I) C[i,j]; execute{   writeln(s); }
The timings are included in the following table:

 Model GNU Mathprog AMPL GAMS OPL sparse (identity) 50 sec 3 sec 0.1 sec 17 sec dense (all ones) 50 sec 3 sec 3 sec 17 sec
So in IBM OPL: no sparse processing, and a little bit slower than the other commercial modeling systems (but substantial faster than GNU's Mathprog).

## Wednesday, June 16, 2010

### ipopt for very small NLPs

In the paper http://www.palgrave-journals.com/jors/journal/v61/n3/full/jors200977a.html a small NLP model is developed to minimize fuel use for commercial ships by optimizing speed. The GAMS representation of the proposed model can look like:

\$ontext

minimize fuel use by optimizing speed on shipping routes

Reference:

Reducing fuel emissions by optimizing speed on

shipping routes, K Fagerholt, G Laporte and I Norstad,

Journal of the Operational Research Society (2010) 61, 523 --529

\$offtext

set
i
'segments' /

s1  Antwerp-Milford Haven

s2  Milford Haven - Boston

s3  Boston - Charleston

s4  Charleston - Algeciras

s5  Algeciras - Point Lisas

s6  Point Lisas - Houston

/
;

table data(i,*)

distance  early last
s1      510      1    5
s2     2699      9   13
s3      838     11   15
s4     3625     20   24
s5     3437     32   36
s6     2263     35   39

;

scalars
minspeed
'knots' /14/
maxspeed
'knots' /20/
;

parameter d(i) 'distance (nautical miles)';
d(i) = data(i,
'distance');

positive variables
v(i)
'speed (knots)'
t(i)
'time'
;
free variable fuel;

v.lo(i) = minspeed;
v.up(i) = maxspeed;

t.lo(i) = data(i,
'early')*24;
t.up(i) = data(i,
'last')*24;

*--------------------------------------------------------------------
* speed model
*--------------------------------------------------------------------

equations
obj1
time1(i)
;

obj1..  fuel =e=
sum(i, d(i)*(0.0036*sqr(v(i))-0.1015*v(i)+0.8848) );

time1(i).. t(i) - t(i-1) =g= d(i)/v(i);

model m1 /obj1,time1/;

*--------------------------------------------------------------------
* time model
*--------------------------------------------------------------------

positive variable
dt(i)
'sailing time'
;
* prevent division by zero by assuming lowerbound of 1
dt.lo(i) = 1;
equations
obj2
time2(i)
;

obj2..  fuel =e=
sum(i, d(i)*(0.0036*sqr(d(i)/dt(i))-0.1015*(d(i)/dt(i))+0.8848) );

time2(i).. t(i) - t(i-1) =g= dt(i);

model m2 /obj2,time2/;

solve m2 using nlp minimizing fuel;

The second model has only linear constraints so that may be preferable in general. The waiting time is handled implicitly in this model. I would probably make the model slightly larger by making this more explicit by introducing a positive variable wt(i):

time1(i).. t(i) - t(i-1) =e= d(i)/v(i) + wt(i);

The paper mentions that IPOPT solves these models in about .5 seconds, and suggest that a discretization yielding a shortest path problem is much faster. I would suggest to try also some other NLP solvers, as IPOPT is really geared towards large scale problems. Other solvers may be a lot faster than IPOPT on these small problems. Indeed with CONOPT I see:

 S O L V E      S U M M A R Y      MODEL   m1                  OBJECTIVE  fuel      TYPE    NLP                 DIRECTION  MINIMIZE      SOLVER  CONOPT              FROM LINE  65 **** SOLVER STATUS     1 Normal Completion         **** MODEL STATUS      2 Locally Optimal           **** OBJECTIVE VALUE             2266.4832 RESOURCE USAGE, LIMIT          0.000      1000.000 ITERATION COUNT, LIMIT        17    2000000000 EVALUATION ERRORS              0             0

The reported time is 0 seconds. (It is noted that the larger test problems in the paper are a little bit larger than this example, but not by much). Probably a good dense solver like DONLP will do very good on a model like this.

## Thursday, June 10, 2010

When doing real applications the demands on writing reports are sometime very high. Users have special requirements how the reports look like, and they often allow very little deviation from what they want. For this reason I often use GDXXRW with the CLEAR/MERGE option to update a table with solution data. This option will allow us to maintain a carefully designed layout.

The following was a little bit of a puzzle. How to update the spreadsheet below while skipping column E (units). The data is 4 dimensional: Commodity, Variable, Region, Year.

With GDXXRW we cannot use MERGE as that will keep old values in the spreadsheet. The option CLEAR is also not usable as it will wipe out column E. Eventually I found the following to work:
• Create an empty 5 dimensional parameter (rdim=4, cdim=1), so that we can CLEAR the body of the table
• Then use the original 4 dimensional parameter (rdim=3, cdim=1) with a MERGE option to fill the table body.
parameter reportx(*,*,*,*,*,t) 'dummy to be able to clear but keep column with units';
* index 1: commodity or '-'
* index 2: variable name
* index 3: region or '_'
* index 4: scenario
* index 5: unit or ''
* index 6: year

reportx(
'','','','','',t) = 0;

execute '=gdxxrw i=report.gdx o=report2.xlsx skipempty=10 par=reportx rng=A4 rdim=5 cdim=1 clear par=report rng=A4 rdim=4 cdim=1 merge trace=2';

When we use the CLEAR/MERGE option, all first rdim columns have to be unique. This is somewhat unfortunate. I want a header now and then, as in:

but I can only use “Production Detail” only once. (I have had similar issues before with things like a “Subtotal”). This case often works as expected. Not always, as can be seen here:

 C:\projects\china\dec24\output\chn10>gdxxrw i=report.gdx o=report2.xlsx skipempty=10 par=reportx rng=A4 rdim=5 cdim=1 clear par=report rng=A4 rdim=4 cdim=1 merge trace=2 GDXXRW           BETA  1May10 23.4.0 WIN 17193.17196 VS8 x86/MS Windows Excel version 12.0 Input file : C:\projects\china\dec24\output\chn10\report.gdx Output file: C:\projects\china\dec24\output\chn10\report2.xlsx Type Symbol       Dim     Sheet                Data          RowHeader     ColHeader Par  reportx        6     Sheet1               F5:XFD1000000 A5:E1000000   F4:XFD4 Reading range A5:E1000000 **** Duplicate Row/Column label(s) for symbol reportx:    Duplicate Row: A1062: (Yield, , , , ) Reading range F4:XFD4 Par  report         5     Sheet1               E5:XFD1000000 A5:D1000000   E4:XFD4 Reading range A5:D1000000    Duplicate Row: A1062: (Yield, , , ) Reading range E4:XFD4 Reading range E5:Y3767 Total time = 8065 Ms

May be it depends on whether some cells are empty or contain empty strings (the difference is not really visible in a spreadsheet).

However the same scheme for real rows would also be useful. E.g. in some cases I want data to be repeated, such as “YEAR”: in each row with such a key. just repeat the content.

Here I used the trick to introduce y2,y3,y5,… to make sure multiple year rows are being written. It would be better if GDXXRW would allow me to have multiple rows with row header “year”.

Update:

The “duplicate” errors are really warnings. The reason why some duplicate headers (“Production Detail”) don’t generate a message and others (“Yield”) do, depends on whether these descriptions are also used as UELS (set elements) in the GDX file.

## Sunday, June 6, 2010

### Constraint programming

One of the advantages of CP is that some problems can be formulated in a much more natural form. For instance for this problem: http://yetanothermathprogrammingconsultant.blogspot.com/2010/04/miqp-vs-dynamic-programming.html is it quite useful to use a variable as an index in an array.

 /*********************************************  * OPL 6.3 Model  * Author: erwin  * Creation Date: Jun 6, 2010 at 5:31:28 PM  *********************************************/  using CP;  int NC = 239; {int} Chapters = asSet(1..NC);   /* int D = 128; solve smaller version: */ int D=20; {int} Days = asSet(1..D);                      int Verses[Chapters] = [20,24,31,38,22,6,22,38,6,22,36,23,42,30,36,39,55,25,24,22,26,31,32,30,                         25,35,34,18,11,25,54,25,8,22,26,6,30,13,25,22,21,34,16,6,22,32,30,33,                         35,32,14,18,21,9,15,19,35,14,18,77,13,27,27,15,30,18,18,41,27,30,15,                         7,33,21,19,22,29,37,35,12,31,15,20,35,29,26,36,16,39,25,24,39,37,20,                         47,33,38,27,20,62,8,27,32,34,32,46,37,31,29,19,21,39,43,36,30,23,35,18,                         30,17,37,30,14,17,60,38,43,23,41,16,30,47,15,19,26,15,31,54,24,24,41,36,                         25,30,40,37,40,23,24,35,57,36,41,13,36,21,52,17,34,14,37,26,52,41,29,28,                         41,19,38,26,39,31,17,25,30,19,26,33,26,30,26,25,22,19,41,48,34,27,24,20,                         25,39,36,46,29,17,14,18,6,21,33,39,9,2,49,19,29,22,23,24,22,10,41,37,43,                         25,28,19,6,30,27,26,35,34,23,41,31,31,34,4,3,4,3,2,9,48,30,26,34];   dvar int startch[Days] in 1..NC; dvar int numch[Days] in 1..NC; dvar int+ v[Days]; float avg = (sum(c in Chapters) Verses[c])/D;   int verses[1..card(Chapters),1..card(Chapters)]; execute{    var i;    var j;    for(i=1; i<=NC; ++i) {       verses[i][1] = Verses[i];       for(j=2;i+j-1<=NC;++j)           verses[i][j] = verses[i][j-1]+Verses[i+j-1];    }         }   minimize   sum(d in Days) (v[d]-avg)*(v[d]-avg); subject to {   startch[1] == 1;   forall(d in 2..D) startch[d] == startch[d-1] + numch[d-1];   startch[D]+numch[D] == card(Chapters)+1;   forall(d in Days) v[d] == verses[startch[d],numch[d]]; }

Notes:

• This is written in IBM OPL
• GAMS has nothing to offer in this area
• AMPL has some facilities documented in some papers (http://users.iems.northwestern.edu/~4er/WRITINGS/cp.pdf) but I am unaware of this being actually available
• Some of the newer systems (Comet, MS Solver Foundation) have constraint programming support
• Some research codes out there have some support for a form of modeling language input (Minizinc, etc.).