Saturday, April 25, 2009

GLPK: gams vs glpsol

The GAMS version of GLPK seems numerically less stable than the standalone GLPK version (at least for this case). Here is a GAMS model, where we used an option file to not use a final LP to get duals:

$ontext

    Alphametics, Simple formulation
    Erwin Kalvelagen, 1998

     G E O R G I A
       O R E G O N
     V E R M O N T
   ---------------
   V I R G I N I A

$offtext

set i /G,E,O,R,I,A,N,V,M,T/;
alias (i,j);

abort$(card(i) <> 10) "set i should have 10 elements";

parameter v(i); v(i) = ord(i)-1;

variables
     y(i) 'decision variables'
     x(i,j) 'auxiliary variables for uniqueness'
     z 'dummy objective variable'
;
binary variable x;

equation
     addition 'the actual problem'
     ydef(i) 'calculate y'
     xrow(i) 'row sums'
     xcol(j) 'column sums'
     dummy_objective
;

addition..
              1e6*y('G') + 1e5*y('E') + 1e4*y('O') + 1e3*y('R') + 100*y('G') + 10*y('I') + y('A') +
                           1e5*y('O') + 1e4*y('R') + 1e3*y('E') + 100*y('G') + 10*y('O') + y('N') +
              1e6*y('V') + 1e5*y('E') + 1e4*y('R') + 1e3*y('M') + 100*y('O') + 10*y('N') + y('T') =e=
1e7*y('V') + 1e6*y('I') + 1e5*y('R') + 1e4*y('G') + 1e3*y('I') + 100*y('N') + 10*y('I') + y('A');

ydef(i).. y(i) =e= sum(j, x(i,j)*v(j));
xrow(i).. sum(j, x(i,j)) =e= 1;
xcol(j).. sum(i, x(i,j)) =e= 1;

*
* bounds on leading digits: they can not be 0
*
y.lo('G') = 1; y.lo('V') = 1;

dummy_objective.. z =e= sum(i, y(i));

model m /all/;
m.optfile=1;
solve m using mip minimizing z;

option y:0;
display y.l;

$onecho > coinglpk.opt
solvefinal=0
$offecho

This gives:

**** SOLVER STATUS     1 NORMAL COMPLETION        
**** MODEL STATUS      8 INTEGER SOLUTION         
**** OBJECTIVE VALUE               45.0000

---- VAR y  decision variables

         LOWER          LEVEL          UPPER         MARGINAL

G         1.0000         1.0000        +INF            EPS     NOPT
E        -INF             .            +INF            EPS     NOPT
O        -INF             .            +INF            EPS     NOPT
R        -INF             .            +INF            EPS     NOPT
I        -INF             .            +INF            EPS     NOPT
A        -INF             .            +INF            EPS     NOPT
N        -INF             .            +INF            EPS     NOPT
V         1.0000         1.0000        +INF            EPS     NOPT
M        -INF             .            +INF            EPS     NOPT
T        -INF             .            +INF            EPS     NOPT

This solution is not correct. I suspect the NOPTS are just the result of a bug in GAMS link. As explained in a previous posting this behavior may be because of scaling. The log file looks like:

--- Job alphametic2.gms Start 04/25/09 19:02:59 WEX-VIS 23.0.2 x86/MS Windows       
GAMS Rev 230  Copyright (C) 1987-2009 GAMS Development. All rights reserved
Licensee: Erwin Kalvelagen                               G090213/0001CV-WIN
          Amsterdam Optimization Modeling Group                      DC4572
--- Starting compilation
--- alphametic2.gms(63) 3 Mb
--- Starting execution: elapsed 0:00:00.004
--- alphametic2.gms(55) 4 Mb
--- Generating MIP model m
--- alphametic2.gms(56) 4 Mb
---   32 rows  111 columns  320 non-zeroes
---   100 discrete-columns
--- Executing COINGLPK: elapsed 0:00:00.009

GAMS/CoinGlpk LP/MIP Solver (Glpk Library 4.32 )
written by A. Makhorin
Reading parameter(s) from "C:\projects\test\coinglpk.opt"
>>  solvefinal=0
Finished reading from "C:\projects\test\coinglpk.opt"

Starting GLPK Branch and Bound...
ipp_basic_tech:  0 row(s) and 0 column(s) removed
ipp_reduce_bnds: 4 pass(es) made, 14 bound(s) reduced
ipp_basic_tech:  0 row(s) and 3 column(s) removed
ipp_reduce_coef: 1 pass(es) made, 0 coefficient(s) reduced
glp_intopt: presolved MIP has 31 rows, 107 columns, 300 non-zeros
glp_intopt: 97 integer columns, all of which are binary
Scaling...
A: min|aij| = 1.000e+000  max|aij| = 9.000e+006  ratio = 9.000e+006
GM: min|aij| = 1.304e-001  max|aij| = 7.666e+000  ratio = 5.877e+001
EQ: min|aij| = 1.707e-002  max|aij| = 1.000e+000  ratio = 5.858e+001
2N: min|aij| = 9.766e-003  max|aij| = 1.500e+000  ratio = 1.536e+002
Crashing...
Size of triangular part = 29
Solving LP relaxation...
      0: obj = -1.766865672e+002  infeas = 1.380e+003 (2)
*    37: obj =  4.500000000e+001  infeas = 3.052e-014 (1)
OPTIMAL SOLUTION FOUND
Integer optimization begins...
Gomory's cuts enabled
Cover cuts enabled
Clique cuts enabled
Creating the conflict graph...
The conflict graph has 2*97 vertices and 955 edges
+    37: mip =     not found yet >=              -inf        (1; 0)
+ 62738: mip =     not found yet >=  4.500000000e+001        (231; 4046)
+ 64138: >>>>>  4.500000000e+001 >=  4.500000000e+001 < 0.1 (236; 4157)
+ 64138: mip =  4.500000000e+001 >=  4.500000000e+001 < 0.1 (236; 4157)
RELATIVE MIP GAP TOLERANCE REACHED; SEARCH TERMINATED

Solving fixed problem skipped.

Integer Solution.
Best solution:                   45   (64138 iterations, 5.137 seconds)
Writing primal solution. Objective: 45 Time: 5.137 s
--- Restarting execution
--- alphametic2.gms(56) 0 Mb
--- Reading solution for model m
--- Executing after solve: elapsed 0:00:05.203
--- alphametic2.gms(59) 3 Mb
*** Status: Normal completion
--- Job alphametic2.gms Stop 04/25/09 19:03:04 elapsed 0:00:05.204

The same model formulated in AMPL or rather GNU Mathprog looks like:

set I;
var x{i in I, j in I}, binary;
var y{i in I}, >= 0, <= 9;
param v{i in I};
minimize obj: sum{i in I} y[i];
addition:
     1000000*y['G'] + 100000*y['E'] + 10000*y['O'] + 1000*y['R'] + 100*y['G'] + 10*y['I'] + y['A'] +
        100000*y['O'] + 10000*y['R'] + 1000*y['E'] + 100*y['G'] + 10*y['O'] + y['N'] +
        1000000*y['V'] + 100000*y['E'] + 10000*y['R'] + 1000*y['M'] + 100*y['O'] + 10*y['N'] + y['T'] =
     10000000*y['V'] + 1000000*y['I'] + 100000*y['R'] + 10000*y['G'] + 1000*y['I'] + 100*y['N'] + 10*y['I'] + y['A'];
ydef{i in I}: y[i] = sum{j in I} x[i,j]*v[j];
xrow{i in I}: sum{j in I} x[i,j] = 1;
xcol{j in I}: sum{i in I} x[i,j] = 1;
bound1: y['G']>=1;
bound2: y['V']>=1;
solve;
display y;
data;
set I := G E O R I A N V M T;
param v := G 0  E 1  O 2  R 3  I 4  A 5  N 6  V 7  M 8  T 9;
end;

The result looks much better:

>cmd /c path=%path%;C:\projects\glpk\glpk\gusek&& glpsol.exe --cover --clique --gomory --mir -m "alphametic.mod"    && if not '00'=='00' gusek.exe 
Reading model section from alphametic.mod...
Reading data section from alphametic.mod...
21 lines were read
Generating obj...
Generating addition...
Generating ydef...
Generating xrow...
Generating xcol...
Generating bound1...
Generating bound2...
Model has been successfully generated
ipp_basic_tech:  3 row(s) and 0 column(s) removed
ipp_reduce_bnds: 5 pass(es) made, 31 bound(s) reduced
ipp_basic_tech:  3 row(s) and 27 column(s) removed
ipp_reduce_coef: 1 pass(es) made, 0 coefficient(s) reduced
glp_intopt: presolved MIP has 28 rows, 83 columns, 230 non-zeros
glp_intopt: 74 integer columns, all of which are binary
Scaling...
A: min|aij| = 1.000e+000  max|aij| = 1.001e+006  ratio = 1.001e+006
GM: min|aij| = 1.619e-001  max|aij| = 6.178e+000  ratio = 3.817e+001
EQ: min|aij| = 2.673e-002  max|aij| = 1.000e+000  ratio = 3.741e+001
2N: min|aij| = 1.563e-002  max|aij| = 1.250e+000  ratio = 8.000e+001
Crashing...
Size of triangular part = 26
Solving LP relaxation...
      0: obj =  1.008907291e+001  infeas = 2.123e+002 (2)
*    33: obj =  4.500000000e+001  infeas = 4.764e-015 (1)
OPTIMAL SOLUTION FOUND
Integer optimization begins...
Gomory's cuts enabled
MIR cuts enabled
Cover cuts enabled
Clique cuts enabled
Creating the conflict graph...
The conflict graph has 2*74 vertices and 784 edges
+    33: mip =     not found yet >=              -inf        (1; 0)
+ 20878: >>>>>  4.500000000e+001 >=  4.500000000e+001   0.0% (34; 1002)
+ 20878: mip =  4.500000000e+001 >=     tree is empty   0.0% (0; 1181)
INTEGER OPTIMAL SOLUTION FOUND
Time used:   3.6 secs
Memory used: 1.0 Mb (1014952 bytes)
Display statement at line 17
y[G] = 8
y[E] = 4.00000000000001
y[O] = 5.99999999999998
y[R] = 4.99999999999998
y[I] = 0
y[A] = 2.00000000000001
y[N] = 3
y[V] = 1
y[M] = 9
y[T] = 7
Model has been successfully processed
>Exit code: 0

This solution is correct. At this moment I have no explanation why the GAMS version is doing worse. There may be issues with compilers, GAMS handling of the objective or other possible reasons. There is a difference in the log indicating the scaling is done differently. I believe the standalone version is 4.34 (Dec 2008) compared to 4.32 (built Feb 12 2009) for the GAMS version, so the GAMS system was little bit outdated. Also note that I used a 64 bit GAMS system and a 32 bit GLPSOL system. It remains very treacherous to just compare results from a numerical algorithm, but in this case we expect things to be more equal than they are now, and it may be possible we stumbled upon a bug.