Thursday, February 26, 2009

Finding all roots with GAMS

> How do I find all roots of x^3-x^2-x=0 with GAMS.

GAMS is more geared towards solving large-scale nonlinear programming problems, where just one, hopefully optimal, solution is reported. The typical NLP solvers under GAMS support this single solution paradigm (they are using numerical optimization algorithms). A symbolic math package such as Mathematica would be ideally suited for a problem of finding all roots of a polynomial.


Update: See comment for a good solution using GAMS/BARON.

Tuesday, February 24, 2009

Timing of Sparse Matrix Multiplication in Mathprog, Ampl and GAMS

> In Mathprog, can I use sparse matrix multplication?
In GAMS parameters are stored sparse and also the operations are executed sparse automatically. This can make a large difference in performance (cpu time and memory usage). Below I will investigate how GAMS, AMPL and Mathprog behave on a sparse matrix multiplication.
Here is a small example of multiplying two identity matrices in AMPL or Mathprog:
param N := 250;
set I := {1..N};
param A{i in I, j in I} := if (i=j) then 1;
param B{i in I, j in I} := if (i=j) then 1;
param C{i in I, j in I} := sum{k in I} A[i,k]*B[k,j];
param s := sum{i in I, j in I} C[i,j];
display s;
end;

Using the timeit.exe tool from the windows 2003 resource kit, we see:
C:\projects\glpk\glpk\bin>timeit glpsol --math x.mod
Reading model section from x.mod...
9 lines were read
Display statement at line 7
s = 250
Model has been successfully generated
Problem has no rows

Version Number: Windows NT 6.0 (Build 6001)
Exit Time: 4:28 pm, Tuesday, February 24 2009
Elapsed Time: 0:00:51.025
Process Time: 0:00:50.076
System Calls: 3306455
Context Switches: 841545
Page Faults: 191189
Bytes Read: 36549676
Bytes Written: 1409054
Bytes Other: 120151

C:\projects\glpk\glpk\bin>timeit \projects\ampl\amplcml\ampl.exe x.mod
s = 250


Version Number: Windows NT 6.0 (Build 6001)
Exit Time: 4:28 pm, Tuesday, February 24 2009
Elapsed Time: 0:00:02.571
Process Time: 0:00:02.542
System Calls: 104208
Context Switches: 20998
Page Faults: 9767
Bytes Read: 59854
Bytes Written: 784
Bytes Other: 6240

C:\projects\glpk\glpk\bin>

I.e. Mathprog takes 50 seconds and AMPL only 3 seconds. This difference is largely due to smarter implementation in AMPL, but likely not to better sparse execution. If we fill the matrices A and B with all ones we get similar timings (50 vs 3 seconds).
When we translate this to GAMS:
set i /1*250/;
alias (i,j,k);
parameter A(i,j),B(i,j),C(i,j);
A(i,i) = 1;
B(i,i) = 1;
C(i,j) = sum(k, A(i,k)*B(k,j));
scalar s;
s = sum((i,j),C(i,j));
display s;

we get even better performance:
C:\projects\glpk\glpk\bin>timeit gams.exe x.gms
--- Job x.gms Start 02/24/09 16:33:00 WEX-WEI 23.0.2 x86_64/MS Windows
GAMS Rev 230 Copyright (C) 1987-2009 GAMS Development. All rights reserved
Licensee: Erwin Kalvelagen G090213/0001CV-WIN
Amsterdam Optimization Modeling Group DC4572
--- Starting compilation
--- x.gms(9) 3 Mb
--- Starting execution: elapsed 0:00:00.008
--- x.gms(9) 4 Mb
*** Status: Normal completion
--- Job x.gms Stop 02/24/09 16:33:00 elapsed 0:00:00.031

Version Number: Windows NT 6.0 (Build 6001)
Exit Time: 4:33 pm, Tuesday, February 24 2009
Elapsed Time: 0:00:00.109
Process Time: 0:00:00.015
System Calls: 7493
Context Switches: 1558
Page Faults: 3959
Bytes Read: 495203
Bytes Written: 418566
Bytes Other: 702554

C:\projects\glpk\glpk\bin>


I.e. only 0.1 seconds. This is indeed due to sparse processing of the matrix multiplication. If we fill A and B with ones then the GAMS timing becomes close to AMPL: about 3 seconds. Indeed when we look at the GAMS profiling information we see how sparsity is exploited in the different statements:
**** SPARSE MODEL (Unity matrix)
---- 1 ExecInit 0.000 0.000 SECS 3 Mb
---- 4 Assignment A 0.000 0.000 SECS 4 Mb 250
---- 5 Assignment B 0.000 0.000 SECS 4 Mb 250
---- 6 Assignment C 0.016 0.016 SECS 4 Mb 250
---- 8 Assignment s 0.000 0.016 SECS 4 Mb

---- 9 PARAMETER s = 250.000
---- 9 Display 0.000 0.016 SECS 4 Mb

**** DENSE MODEL (Matrix with ones)

---- 1 ExecInit 0.000 0.000 SECS 3 Mb
---- 4 Assignment A 0.000 0.000 SECS 6 Mb 62500
---- 5 Assignment B 0.015 0.015 SECS 8 Mb 62500
---- 6 Assignment C 2.746 2.761 SECS 13 Mb 62500
---- 8 Assignment s 0.000 2.761 SECS 13 Mb

---- 9 PARAMETER s = 1.562500E+7
---- 9 Display 0.000 2.761 SECS 13 Mb

All the action is in the assignment to C and this is where GAMS really shows it strength. To summarize, the timings are roughly:

Model Mathprog AMPL GAMS
sparse (identity) 50 sec 3 sec 0.1 sec
dense (all ones) 50 sec 3 sec 3 sec

I suspect AMPL is not fully exploiting sparsity here. In general AMPL does however do a really good job in using sparse patterns during model generation with performance similar to GAMS. I have seen before models with Mathprog is just collapsing where the same model executes and generates very quickly under AMPL or GAMS. Nevertheless it is quite amazing how many models generate fast with Mathprog. This is likely because in many cases, although the final LP matrix is very large and sparse, substructures in the model can often be dense and relatively small. I.e. many models can be generated quite efficiently using dense technology for most operations. Only when the substructures become very large and sparse, better sparse technology -- i.e. loop over the nonzero elements only -- as used in GAMS (and AMPL) becomes necessary.

Friday, February 20, 2009

GAMS system directory

> Where is the GAMS system directory?

Run a model with

display "%gams.sysdir%";


Wednesday, February 18, 2009

MS Solver Foundation /Gurobi (2)

The MIP capabilities of the standard editions have been significantly reduced. In the previous version this was 2k variables and equations and now only 500 variables and equations. This is looks like a Gurobi limit:

Error:
The solver(s) threw an exception while solving the model - Gurobi throttle exceeded.
Visit www.gurobi.com/html/products.html to purchase an unlock license (HostID: 8AA3C6D2).


The obj seems to be counted as an equation:

Model[

// generate simple MIP model with 500 variables and 500 constraints
Parameters[Integers,N=500],
Decisions[Integers[0,1],Foreach[{i,N},x[i]]],

Constraints[ Foreach[{i,N},x[i] <= 1] ],

Goals[
Maximize[Sum[{i,N},x[i]]]
//Maximize[obj->Sum[{i,N},x[i]]] use to prevent funny name for goal
]

]

does trigger the above message. From the Excel plug-in I don't see an option to use the MS MIP solver. I hope I am wrong, but I suspect there is a class of MIP models that could be solved with the previous version of the Standard Edition that can no longer be solved with the current 1.1 release.

The workaround seems obvious: purchase the enterprise edition or purchase Gurobi.

Note that the limit is on any variable, i.e. 1 integer variable + 500 continuous variables makes the model too large.


Update: If your problem is in between these limits this can help. There is a way to point the excel plugin back to the MS MIP solver through a configuration file. For more info please contact the MSF people at http://code.msdn.microsoft.com/solverfoundation. They are very helpful and responsive.

Basis information in GAMS

> How do I found out which variables and slacks are basic and non-basic.

Look at the marginals.  Nonzero or EPS marginals means non-basic and zero marginals means basic. In GAMS marginals are duals for the rows and reduced costs for the columns. Of course row duals can be interpreted as reduced costs for the corresponding slacks. For instance when looking at the solution from the trnsport model we see:


                           LOWER          LEVEL          UPPER         MARGINAL

---- EQU cost . . . 1.0000 Non-basic

---- EQU supply observe supply limit at plant i

LOWER LEVEL UPPER MARGINAL

seattle -INF 350.0000 350.0000 EPS Non-basic
san-diego -INF 550.0000 600.0000 . Basic

---- EQU demand satisfy demand at market j

LOWER LEVEL UPPER MARGINAL

new-york 325.0000 325.0000 +INF 0.2250 Non-basic
chicago 300.0000 300.0000 +INF 0.1530 Non-basic
topeka 275.0000 275.0000 +INF 0.1260 Non-basic

---- VAR x shipment quantities in cases

LOWER LEVEL UPPER MARGINAL

seattle .new-york . 50.0000 +INF . Basic
seattle .chicago . 300.0000 +INF . Basic
seattle .topeka . . +INF 0.0360 Non-basic
san-diego.new-york . 275.0000 +INF . Basic
san-diego.chicago . . +INF 0.0090 Non-basic
san-diego.topeka . 275.0000 +INF . Basic

LOWER LEVEL UPPER MARGINAL

---- VAR z -INF 153.6750 +INF . Basic




This model has 6 equations and 7 variables. So we expect 6 basic rows/columns and 7 non-basic ones. This indeed is the case.

Note: this model is dual degenerate as it has an EPS marginal. This means the dual of equation supply('seattle') is numerically zero but non-basic nevertheless. This indicates presence of alternative optimal solutions.

Friday, February 13, 2009

MS Solver Foundation / Gurobi

MS Solver Foundation 1.1 is out. It includes Gurobi! (it has even become the default MIP solver). See http://code.msdn.microsoft.com/solverfoundation.

Indeed we see when running a MIP from the Excel plugin:




Thursday, February 12, 2009

TSP Powerset Formulation

> I'm a PhD student in Portugal and I´m modeling a Vehicle Routing
> Problem. I have some difficulties to modelling this subtour
> elimination constraint (n = number of nodes):

> sum(i in S, sum(j in S, x(i,j))) <= |S|-1
> for every nonempty subset S of {2,3,..,n}


This is the Ford, Fulkerson, Johnson (1954) formulation, based on the power of {2,3,..,n}. This generates a lot of constraints (2^(n-1)), with many not active in the optimal solution. So this approach only works for small problems.

$ontext

  
Burma 14 city problem using power set formulation

  
Erwin Kalvelagen, Amsterdam Optimization

  
References:
       
Gerhard Reinelt, Universitaet Heidelberg, TSPLIB95
       
A.J. Orman & H.P. Williams, A Survey of Different Integer Programming
           
Formulations of the Travelling Salesman Problem

$offtext


$eolcom //

sets
   i
/city1*city14/
   xy
/x,y/
;
alias(i,j);

table coordinates(i,xy) 'coordinates from tsplib'

          
x            y
  
city1  16.47       96.10
  
city2  16.47       94.44
  
city3  20.09       92.54
  
city4  22.39       93.37
  
city5  25.23       97.24
  
city6  22.00       96.05
  
city7  20.47       97.02
  
city8  17.20       96.29
  
city9  16.30       97.38
 
city10  14.05       98.12
 
city11  16.53       97.38
 
city12  21.52       95.59
 
city13  19.41       97.13
 
city14  20.09       94.55
;

*
* form the distance matrix
*
parameter latitude(i), longitude(i), degree(i,xy), minute(i,xy);
degree(i,xy) = trunc(coordinates(i,xy));
minute(i,xy) = coordinates(i,xy) - trunc(coordinates(i,xy));
latitude(i) = pi*(degree(i,
'x')+5.0*minute(i,'x'
)/3.0)/180.0;
longitude(i) = pi*(degree(i,
'y')+5.0*minute(i,'y'
)/3.0)/180.0;


set
nd(i,j);
nd(i,j)$(
not sameas(i,j)) = yes
;


scalar rrr /6378.388/
;
parameter
q1(i,j),q2(i,j),q3(i,j);
q1(nd(i,j)) = cos(longitude(i) - longitude(j));
q2(nd(i,j)) = cos(latitude(i) - latitude(j));
q3(nd(i,j)) = cos(latitude(i) + latitude(j));

parameter c(i,j) 'distance matrix (KM)'
;
c(nd(i,j)) = trunc(RRR * arccos( 0.5*((1.0+q1(i,j))*q2(i,j) - (1.0-q1(i,j))*q3(i,j)) ) + 1.0);
display
c;


*

* form the power set
*
set v(i) 'exclude city 1' /city2*city14/;
abort$(card(i)<>card(v)+1) 'Set v is incorrect'
,i,v;
set n 'subset id (max number of subsets)' /n1*n8192/
;

scalar nsize 'size needed for set n'
;
nsize = 2**
card
(v);
abort$(card(n)<nsize) 'set n is too small'
,nsize;

sets

  base
'used in next set' /no,yes/
  ps0(n,v,base)
/ system.powersetRight/
  ps(n,v)
'power set'
;
ps(n,v) = ps0(n,v,
'yes');


parameter
pscount(n);
pscount(n) =
sum
(ps(n,v),1);

*

* form subsets, exclude subsets with cardinality = 1
*
set nsub(n);
nsub(n)$(pscount(n)>=2) =
yes
;

*

* form ps2(n,i,j) = ps(n,i) and ps(n,j)
* this will simplify the subtour elimination constraint
*
alias(v,vv);
set
ps2(n,i,j);
ps2(nsub(n),nd(v,vv))$(ps(n,v)
and ps(n,vv)) = yes
;

*

* here is the model
*
binary variables x(i,j);
variable z 'objective variable'
;

equations

   obj 
'objective'
   assign1(i)
'assignment problem constraint'
   assign2(j)
'assignment problem constraint'
   subt(n)   
'subtour elimination'
;

obj..   z =e=
sum(nd(i,j), c(i,j)*x(i,j));
assign1(i)..
sum
(nd(i,j), x(i,j)) =e= 1;
assign2(j)..
sum
(nd(i,j), x(i,j)) =e= 1;
subt(nsub(n))..
sum
(ps2(n,i,j),x(i,j)) =L=  pscount(n)-1;

option
optcr=0;
model tsp /all/
;
solve
tsp minimizing z using mip;

Note: the construct system.powerSetRight is a new GAMS feature (24.0, 2013).

This solves quite fast:

--- Starting compilation
--- burma14.gms(122) 9 Mb
--- Starting execution: elapsed 0:00:00.028
--- burma14.gms(120) 21 Mb
--- Generating MIP model tsp
--- burma14.gms(122) 44 Mb
---   8,207 rows  183 columns  320,035 non-zeroes
---   182 discrete-columns
--- Executing CPLEX: elapsed 0:00:00.741

IBM ILOG CPLEX   24.7.3 r58181 Released Jul 11, 2016 WEI x86 64bit/MS Windows
--- GAMS/Cplex licensed for continuous and discrete problems.
Cplex 12.6.3.0

Reading data...
Starting Cplex...
Space for names approximately 0.16 Mb
Use option 'names no' to turn use of names off
Tried aggregator 1 time.
MIP Presolve eliminated 1 rows and 1 columns.
Reduced MIP has 8206 rows, 182 columns, and 319852 nonzeros.
Reduced MIP has 182 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.28 sec. (95.59 ticks)
Probing time = 0.03 sec. (15.13 ticks)
Tried aggregator 1 time.
Reduced MIP has 8206 rows, 182 columns, and 319852 nonzeros.
Reduced MIP has 182 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.28 sec. (95.59 ticks)
Probing time = 0.03 sec. (15.13 ticks)
Clique table members: 106.
MIP emphasis: balance optimality and feasibility.
MIP search method: dynamic search.
Parallel mode: none, using 1 thread.
Tried aggregator 1 time.
DUAL formed by presolve
Reduced LP has 182 rows, 8388 columns, and 320034 nonzeros.
Presolve time = 0.08 sec. (27.42 ticks)

Iteration log . . .
Iteration:     1    Objective     =            70.000000
Initializing dual steep norms . . .
Root relaxation solution time = 0.16 sec. (57.42 ticks)

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0     0      integral     0     3323.0000     3323.0000       45    0.00%
Elapsed time = 1.30 sec. (460.29 ticks, tree = 0.00 MB, solutions = 1)
Found incumbent of value 3323.000000 after 1.30 sec. (460.29 ticks)

Root node processing (before b&c):
  Real time             =    1.30 sec. (460.33 ticks)
Sequential b&c:
  Real time             =    0.00 sec. (0.00 ticks)
                          ------------
Total (root+branch&cut) =    1.30 sec. (460.33 ticks)
MIP status(101): integer optimal solution
Cplex Time: 1.30sec (det. 460.33 ticks)
Fixing integer variables, and solving final LP...
Tried aggregator 1 time.
LP Presolve eliminated 8207 rows and 183 columns.
All rows and columns eliminated.
Presolve time = 0.03 sec. (13.14 ticks)
Fixed MIP status(1): optimal
Cplex Time: 0.03sec (det. 17.45 ticks)

Proven optimal solution.

MIP Solution:         3323.000000    (45 iterations, 0 nodes)
Final Solve:          3323.000000    (0 iterations)

Best possible:        3323.000000
Absolute gap:            0.000000
Relative gap:            0.000000

--- Restarting execution
--- burma14.gms(122) 20 Mb
--- Reading solution for model tsp
--- burma14.gms(122) 20 Mb
*** Status: Normal completion
--- Job burma14.gms Stop 02/23/17 00:47:38 elapsed 0:00:02.719

image

References
  1. burma14.gms: above model
  2. burma14-3.gms: generation of powerset written in GAMS (not using system.PowerSetRight).
  3. burma14-2.gms: formulation from Svestka (J.A. Svestka, A continuous variable representation of the TSP, Math Prog, v15, 1978, pp211-213)
  4. burma14-4.gms: a cutting plane technique

Easy MIP

In this post we showed a very difficult mip model with just 217 0-1 variables. Here is a very large MIP model that shows the opposite behavior. Although it has about 230k integer variables, Cplex devours it:


---   122,284 rows  352,038 columns  667,797 non-zeroes
--- 231,136 discrete-columns

Nodes Cuts/
Node Left Objective IInf Best Integer Best Node ItCnt Gap

0 0 588575.0000 606 588575.0000 14792
0 0 588575.0000 689 Cuts: 313 16547
0 0 588575.0000 736 ZeroHalf: 14 17624
* 0+ 0 587775.0000 588575.0000 17624 0.14%
0 2 588575.0000 349 587775.0000 588575.0000 17624 0.14%
Elapsed real time = 30.14 sec. (tree size = 0.00 MB, solutions = 1)
* 10+ 1 588575.0000 588575.0000 18686 0.00%

Total turnaround time (including GAMS): 46 seconds. A good illustration that size of the problem is not a good indicator for difficulty. (Don't know much about the model as it was sent to me in LP format. There may be many variables automatically integer-valued).

Conventions

I doubt that the message from GAMS/COINCBC:

Problem statistics: 352037 columns and 122283 rows.
                    231136 variables have integrality restrictions.

actually is written by code from John Forrest. I guess he would write rows followed by columns (that is how I would have presented this message). Indeed further down we see messages like:

processed model has 50182 rows, 152495 columns (152495 integer) and 429919 elements

It is interesting to note how large a role such conventions play. No one would write a[j,i] unless for a specific reason. When I was teaching numerical programming, I always asked students to redo their code if it contained:

var x : integer;

The strict use of such conventions make things more predictable and thus easier to read quickly.

Monday, February 9, 2009

Set Covering Variant

> I need to model a variant of the set covering problem:
> we need to cover at least p% of the points.

The set covering problem looks like:

param a{I,J},binary;
var x{J},binary;

minimize cost: sum{j in J} x[j];
subject to covers{i in I}: sum{j in J} a[i,j]*x[j] >= 1;

We can formulate your problem with:

param a{I,J},binary;
param p >= 0, <= 1;
var x{J},binary;
var y{I} >= 0, <= 1;
minimize cost: sum{j in J} x[j];
subject to covers{i in I}: sum{j in J} a[i,j]*x[j] >= y[i];

subject to req: sum{i in I} y[i] >= p*card(I);

I believe y does not have to be binary and can be relaxed (between 0 and 1).

Solver Status

There are two cases when it is really important that the Model Status and Solver Status is set correctly by the solver:
  1. When building algorithms in GAMS. In this case the progress and flow of control is determined by the return codes from the SOLVE statement.
  2. When building user interfaces around a model. In this case we want to provide informative messages to the user.
  3. Or both (this was my case: solve is both embedded in an algorithm and I need to report useful messages to the user).

When relying on the Model Status and Solver Status in these cases, we assume that the solver is predictable. Here is a case where the GAMS/Cplex link is sloppy in reporting an iteration limit:


               S O L V E      S U M M A R Y

MODEL m OBJECTIVE z
TYPE MIP DIRECTION MINIMIZE
SOLVER CPLEX FROM LINE 27

**** SOLVER STATUS 4 TERMINATED BY SOLVER
**** MODEL STATUS 14 NO SOLUTION RETURNED
**** OBJECTIVE VALUE 0.0000

RESOURCE USAGE, LIMIT 226.084 1000.000
ITERATION COUNT, LIMIT 0 100000

ILOG CPLEX Dec 1, 2008 22.9.2 WIN 7311.8080 VIS x86/MS Windows
Cplex 11.2.0, GAMS Link 34
Cplex licensed for 1 use of lp, qp, mip and barrier, with 4 parallel threads.

MIP status(110): error termination, no integer solution
*** CPLEX Error 3019: Failure to solve MIP subproblem.

Error solving MIP subproblem.
Solution aborted due to iteration limit.

The correct solver status would be 2:

**** SOLVER STATUS 2 ITERATION INTERRUPT

Note also that the iteration count is not returned correctly. The link claims zero iterations have been performed. The correct number is 100000. This also means M.ITERUSD is wrong and can not be relied on when reporting back.

Thursday, February 5, 2009

floor in MINLP model

> My MINLP model (in AMPL) does not solve: 
> the relaxed NLP is declared infeasible.
> The model contains the floor() function.


Unless you are using a global solver, MINLP solvers assume the relaxations (NLPs formed by fixing or relaxing the integer variables) are smooth. The floor function is discontinuous, so it is dangerous to use. There is a simple reformulation however, which moves the complexity from the NLP to the MIP part:

var x;
var flx integer;

flx ≤ x ≤ flx + 1

To get slightly more predictable behavior when x is (close to) integer, use:

flx ≤ x ≤ flx + 0.9999

In AMPL you may want to rewrite this as:

0 ≤ x − flx ≤ 0.9999

as this can be written as a single constraint. 

This formulation adds an integer variable, but has substantially simplified the NLP: it just adds a linear inequality and removes a discontinuous function.

Monday, February 2, 2009

Social Golfer Problem in OML (2)

In the earlier post we used binary variables x[i,g,t] to model that golfer i plays in group g at round t. Here we show that using CSP constructs it is actually easy to use a different formulation:  x[i,t]=g. I.e. x[i,t] is an integer variable. In a MIP context this would be much more difficult to do.


Model[

// N : number of golfers
// NG : number of groups
// T : number of rounds
// GS : group size (N/NG)

Parameters[Integers,N=16,NG=4,T=5,GS=4],
// Parameters[Integers,N=32,NG=8,T=10,GS=4],

Decisions[
Integers[0,NG-1],
Foreach[{i,N},{t,T},x[i,t]]
],


Constraints[

// form groups
Foreach[{g,NG},{t,T},Sum[{i,N},AsInt[x[i,t]==g]] == GS],

// golfer i and j meet at most one time
Foreach[{i,N}, {j,i+1,N}, Sum[{t,T},AsInt[x[i,t]==x[j,t]]] <= 1],

// fix first round
Foreach[{g,NG},{k,GS},x[GS*g+k,0]==g],

// fix first golfer
Foreach[{t,1,T},x[0,t]==0]
]

]

Sunday, February 1, 2009

GAMS $eval trunc

The use of the trunc function in an evaluation macro:

$eval m trunc(2.5)

leads to an error in GAMS. This has been reported and fixed the same day (on a weekend day)! The next version of GAMS (23.0) that will be out very soon will contain the fix.

A simple workaround is to use the floor function. For positive x, trunc(x)=floor(x). If x is negative, trunc(x)=ceil(x).