Friday, June 25, 2010

GAMS augmenting a set

Ina project we encountered the problem where we loaded a GDX file and subsequently wanted to augment one of the sets in that GDX file with some elements. The following trick will do this:

set i(*);
$gdxin data.gdx
$load i
* i has now elements a,b

$onmulti
set i /c/;
$offmulti
* i has now elements a,b,c

parameter p(i) /
 
a 1
 
b 2
 
c 3
/;
display i,p;

In the GDX file the set i has elements a and b. In the GAMS code we add the element c. This is not very intuitive, but it works. The advantage of this method is that we do this at compile time, i.e. we can declare parameters and variables over the set i. Note that we cannot use DISPLAY to debug this:

set i(*);
$gdxin data.gdx
$load i

display i;

$onmulti
set i /c/;
$offmulti

display i;

This code will show the same output for both display statements:

----      5 SET i 

a,    b,    c

----     11 SET i 

a,    b,    c

This is because the display statement is executed at execution time while the sets are formed at compile time.

Monday, June 21, 2010

Spreadsheets

http://eusprig.org/ has some interesting info on spreadsheet usage. Especially: http://eusprig.org/horror-stories.htm

Friday, June 18, 2010

NA in GAMS

Q: Does the value of A make sense; why is NA in there? There is no NA in B

set i /i1*i10/;
parameter A(i) /i1=1, i2=NA, i3=1e-10/;
display A;
set tiny(i);

tiny(i)$[  A(i)
and
          (Abs(A(i)) < 1e-5)
and
          (A(i) <>
NA)
       ] =
yes;
display tiny;

parameter B(i);
B(i)$(A(i) <>
NA) = A(i);
display B;

The purpose is to find the small values and store their index in set tiny. The result is:

----      5 PARAMETER A 

i1       1.000,    i2          NA,    i3 1.00000E-10

----     12 SET tiny 

i2,    i3

----     16 PARAMETER B 

i1       1.000,    i3 1.00000E-10

We see element i2 is present, although we included A(i)<>NA as condition. This is not completely intuitive, but for element i2, with A(‘i2’)=NA, the condition reads as:

tiny(‘i2’)$(NA and NA and NO) =  yes;

Now the rule is that any operation on NA (including ‘and’) results in NA. So the whole condition becomes:

NA and NA and NO => NA

As $NA evaluates to $1, we actually get that element tiny(i) set to YES.

Working with special values requires special attention, especially in complex expressions. Usually it is best to split this in parts:

set tiny2(i);
tiny2(i)$[A(i)
and (Abs(A(i)) < 1e-5)] = yes;
tiny2(i)$(A(i)=
NA) = no;
display tiny2;

Now this set will only contain the element i3.

Thursday, June 17, 2010

sparse matmul revisited

In http://yetanothermathprogrammingconsultant.blogspot.com/2009/02/timing-of-sparse-matrix-multiplication.html we time a (sparse and dense) matrix multiplication of two matrices (parameters). In OPL this can look like:
int N = 250;
{int} I = asSet(1..N);
float A[i in I][j in I] = (i==j)?1:0;
float B[i in I][j in I] = (i==j)?1:0;
//float A[i in I][j in I] = 1;
//float B[i in I][j in I] = 1;
float C[i in I][j in I] = sum(k in I) A[i,k]*B[k,j];
float s = sum(i in I, j in I) C[i,j];

execute{
  writeln(s);
}
The timings are included in the following table:

Model GNU Mathprog AMPL GAMS OPL
sparse (identity) 50 sec 3 sec 0.1 sec 17 sec
dense (all ones) 50 sec 3 sec 3 sec 17 sec
So in IBM OPL: no sparse processing, and a little bit slower than the other commercial modeling systems (but substantial faster than GNU's Mathprog).

Wednesday, June 16, 2010

ipopt for very small NLPs

In the paper http://www.palgrave-journals.com/jors/journal/v61/n3/full/jors200977a.html a small NLP model is developed to minimize fuel use for commercial ships by optimizing speed. The GAMS representation of the proposed model can look like:

$ontext

   
minimize fuel use by optimizing speed on shipping routes

   
Reference:
     
Reducing fuel emissions by optimizing speed on
     
shipping routes, K Fagerholt, G Laporte and I Norstad,
     
Journal of the Operational Research Society (2010) 61, 523 --529

$offtext

set
  i
'segments' /
       
s1  Antwerp-Milford Haven
       
s2  Milford Haven - Boston
       
s3  Boston - Charleston
       
s4  Charleston - Algeciras
       
s5  Algeciras - Point Lisas
       
s6  Point Lisas - Houston
     
/
;

table data(i,*)
    
distance  early last
s1      510      1    5
s2     2699      9   13
s3      838     11   15
s4     3625     20   24
s5     3437     32   36
s6     2263     35   39

;

scalars
   minspeed
'knots' /14/
   maxspeed
'knots' /20/
;

parameter d(i) 'distance (nautical miles)';
d(i) = data(i,
'distance');

positive variables
   v(i)
'speed (knots)'
   t(i)
'time'
;
free variable fuel;

v.lo(i) = minspeed;
v.up(i) = maxspeed;

t.lo(i) = data(i,
'early')*24;
t.up(i) = data(i,
'last')*24;

*--------------------------------------------------------------------
* speed model
*--------------------------------------------------------------------

equations
  obj1
  time1(i)
;

obj1..  fuel =e=
sum(i, d(i)*(0.0036*sqr(v(i))-0.1015*v(i)+0.8848) );

time1(i).. t(i) - t(i-1) =g= d(i)/v(i);

model m1 /obj1,time1/;

*--------------------------------------------------------------------
* time model
*--------------------------------------------------------------------


positive variable
  dt(i)
'sailing time'
;
* prevent division by zero by assuming lowerbound of 1
dt.lo(i) = 1;
equations
  obj2
  time2(i)
;

obj2..  fuel =e=
sum(i, d(i)*(0.0036*sqr(d(i)/dt(i))-0.1015*(d(i)/dt(i))+0.8848) );

time2(i).. t(i) - t(i-1) =g= dt(i);

model m2 /obj2,time2/;


solve m2 using nlp minimizing fuel;


The second model has only linear constraints so that may be preferable in general. The waiting time is handled implicitly in this model. I would probably make the model slightly larger by making this more explicit by introducing a positive variable wt(i):

time1(i).. t(i) - t(i-1) =e= d(i)/v(i) + wt(i);

The paper mentions that IPOPT solves these models in about .5 seconds, and suggest that a discretization yielding a shortest path problem is much faster. I would suggest to try also some other NLP solvers, as IPOPT is really geared towards large scale problems. Other solvers may be a lot faster than IPOPT on these small problems. Indeed with CONOPT I see:

               S O L V E      S U M M A R Y

     MODEL   m1                  OBJECTIVE  fuel
     TYPE    NLP                 DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  65

**** SOLVER STATUS     1 Normal Completion        
**** MODEL STATUS      2 Locally Optimal          
**** OBJECTIVE VALUE             2266.4832

RESOURCE USAGE, LIMIT          0.000      1000.000
ITERATION COUNT, LIMIT        17    2000000000
EVALUATION ERRORS              0             0

The reported time is 0 seconds. (It is noted that the larger test problems in the paper are a little bit larger than this example, but not by much). Probably a good dense solver like DONLP will do very good on a model like this.

Thursday, June 10, 2010

GAMS: writing spreadsheets

When doing real applications the demands on writing reports are sometime very high. Users have special requirements how the reports look like, and they often allow very little deviation from what they want. For this reason I often use GDXXRW with the CLEAR/MERGE option to update a table with solution data. This option will allow us to maintain a carefully designed layout.

The following was a little bit of a puzzle. How to update the spreadsheet below while skipping column E (units). The data is 4 dimensional: Commodity, Variable, Region, Year.


With GDXXRW we cannot use MERGE as that will keep old values in the spreadsheet. The option CLEAR is also not usable as it will wipe out column E. Eventually I found the following to work:
  • Create an empty 5 dimensional parameter (rdim=4, cdim=1), so that we can CLEAR the body of the table
  • Then use the original 4 dimensional parameter (rdim=3, cdim=1) with a MERGE option to fill the table body.
parameter reportx(*,*,*,*,*,t) 'dummy to be able to clear but keep column with units';
* index 1: commodity or '-'
* index 2: variable name
* index 3: region or '_'
* index 4: scenario
* index 5: unit or ''
* index 6: year

reportx(
'','','','','',t) = 0;

execute_unload "report.gdx",report,reportx;

execute '=gdxxrw i=report.gdx o=report2.xlsx skipempty=10 par=reportx rng=A4 rdim=5 cdim=1 clear par=report rng=A4 rdim=4 cdim=1 merge trace=2';

When we use the CLEAR/MERGE option, all first rdim columns have to be unique. This is somewhat unfortunate. I want a header now and then, as in:

image

but I can only use “Production Detail” only once. (I have had similar issues before with things like a “Subtotal”). This case often works as expected. Not always, as can be seen here:

C:\projects\china\dec24\output\chn10>gdxxrw i=report.gdx o=report2.xlsx skipempty=10 par=reportx rng=A4 rdim=5 cdim=1 clear par=report rng=A4 rdim=4 cdim=1 merge trace=2

GDXXRW           BETA  1May10 23.4.0 WIN 17193.17196 VS8 x86/MS Windows
Excel version 12.0
Input file : C:\projects\china\dec24\output\chn10\report.gdx
Output file: C:\projects\china\dec24\output\chn10\report2.xlsx
Type Symbol       Dim     Sheet                Data          RowHeader     ColHeader
Par  reportx        6     Sheet1               F5:XFD1000000 A5:E1000000   F4:XFD4
Reading range A5:E1000000
**** Duplicate Row/Column label(s) for symbol reportx:
   Duplicate Row: A1062: (Yield, , , , )
Reading range F4:XFD4
Par  report         5     Sheet1               E5:XFD1000000 A5:D1000000   E4:XFD4
Reading range A5:D1000000
   Duplicate Row: A1062: (Yield, , , )
Reading range E4:XFD4
Reading range E5:Y3767
Total time = 8065 Ms

May be it depends on whether some cells are empty or contain empty strings (the difference is not really visible in a spreadsheet).

However the same scheme for real rows would also be useful. E.g. in some cases I want data to be repeated, such as “YEAR”: in each row with such a key. just repeat the content.

image

Here I used the trick to introduce y2,y3,y5,… to make sure multiple year rows are being written. It would be better if GDXXRW would allow me to have multiple rows with row header “year”.

Update:

The “duplicate” errors are really warnings. The reason why some duplicate headers (“Production Detail”) don’t generate a message and others (“Yield”) do, depends on whether these descriptions are also used as UELS (set elements) in the GDX file.

Sunday, June 6, 2010

Constraint programming

One of the advantages of CP is that some problems can be formulated in a much more natural form. For instance for this problem: http://yetanothermathprogrammingconsultant.blogspot.com/2010/04/miqp-vs-dynamic-programming.html is it quite useful to use a variable as an index in an array.

/*********************************************

 * OPL 6.3 Model

 * Author: erwin

 * Creation Date: Jun 6, 2010 at 5:31:28 PM

 *********************************************/

 using CP;

 int NC = 239;

{int} Chapters = asSet(1..NC);

 

/* int D = 128; solve smaller version: */

int D=20;

{int} Days = asSet(1..D);

                    

int Verses[Chapters] = [20,24,31,38,22,6,22,38,6,22,36,23,42,30,36,39,55,25,24,22,26,31,32,30,

                        25,35,34,18,11,25,54,25,8,22,26,6,30,13,25,22,21,34,16,6,22,32,30,33,

                        35,32,14,18,21,9,15,19,35,14,18,77,13,27,27,15,30,18,18,41,27,30,15,

                        7,33,21,19,22,29,37,35,12,31,15,20,35,29,26,36,16,39,25,24,39,37,20,

                        47,33,38,27,20,62,8,27,32,34,32,46,37,31,29,19,21,39,43,36,30,23,35,18,

                        30,17,37,30,14,17,60,38,43,23,41,16,30,47,15,19,26,15,31,54,24,24,41,36,

                        25,30,40,37,40,23,24,35,57,36,41,13,36,21,52,17,34,14,37,26,52,41,29,28,

                        41,19,38,26,39,31,17,25,30,19,26,33,26,30,26,25,22,19,41,48,34,27,24,20,

                        25,39,36,46,29,17,14,18,6,21,33,39,9,2,49,19,29,22,23,24,22,10,41,37,43,

                        25,28,19,6,30,27,26,35,34,23,41,31,31,34,4,3,4,3,2,9,48,30,26,34];

 

dvar int startch[Days] in 1..NC;

dvar int numch[Days] in 1..NC;

dvar int+ v[Days];

float avg = (sum(c in Chapters) Verses[c])/D;

 

int verses[1..card(Chapters),1..card(Chapters)];

execute{

   var i;

   var j;

   for(i=1; i<=NC; ++i) {

      verses[i][1] = Verses[i];

      for(j=2;i+j-1<=NC;++j)

          verses[i][j] = verses[i][j-1]+Verses[i+j-1];

   }        

}

 

minimize

  sum(d in Days) (v[d]-avg)*(v[d]-avg);

subject to {

  startch[1] == 1;

  forall(d in 2..D) startch[d] == startch[d-1] + numch[d-1];

  startch[D]+numch[D] == card(Chapters)+1;

  forall(d in Days) v[d] == verses[startch[d],numch[d]];

}

Notes:

  • This is written in IBM OPL
  • GAMS has nothing to offer in this area
  • AMPL has some facilities documented in some papers (http://users.iems.northwestern.edu/~4er/WRITINGS/cp.pdf) but I am unaware of this being actually available
  • Some of the newer systems (Comet, MS Solver Foundation) have constraint programming support
  • Some research codes out there have some support for a form of modeling language input (Minizinc, etc.).