Yet Another Math Programming Consultant: June 2011

Thursday, June 30, 2011

Cplex polishing

For some difficult large models, the Polishing facility in Cplex can sometimes do extremely well. This is an example:

Elapsed real time = 874.90 sec. (tree size = 11.31 MB, solutions = 2)
    233   235     -158.9737 3551      621.0895     -159.3086   258914 125.65%
    234   236     -158.9448 3578      621.0895     -159.3086   258964 125.65%
    266   268     -158.9498 3578      621.0895     -159.3086   265158 125.65%
    306   308     -158.9338 3568      621.0895     -159.3086   271534 125.65%
    310   312     -158.9315 3511      621.0895     -159.3086   272354 125.65%
    312   314     -158.9489 3581      621.0895     -159.3086   272367 125.65%
    322   324     -158.9310 3574      621.0895     -159.3086   273646 125.65%
    346   348     -158.8857 3561      621.0895     -159.3086   277217 125.65%
    354   356     -158.8368 3532      621.0895     -159.3086   278697 125.65%
    370   372     -158.8741 3567      621.0895     -159.3086   281105 125.65%
Elapsed real time = 995.15 sec. (tree size = 49.71 MB, solutions = 2)

Starting condition for polishing reached (PolishAfterTime).
Starting solution polishing.

*   390+ 390                          360.3876     -159.3086   285698 144.20%
*   390+ 390                          231.7287     -159.3086   285698 168.75%
*   390+ 390                          100.2251     -159.3086   285698 258.95%
    390   392     -158.8957 3503      100.2251     -159.3086   285698 258.95%
    391   393     -158.9880 3601      100.2251     -159.3086   286729 258.95%
    392   394     -158.9935 3613      100.2251     -159.3086   286790 258.95%
    393   395     -158.9735 3574      100.2251     -159.3086   287216 258.95%
    394   396     -158.9763 3595      100.2251     -159.3086   287247 258.95%
    395   397     -158.9980 3601      100.2251     -159.3086   287328 258.95%
    396   398     -158.9806 3600      100.2251     -159.3086   287599 258.95%
    397   399     -159.0641 3615      100.2251     -159.3086   296751 258.95%

GUB cover cuts applied: 854
Clique cuts applied: 14
Cover cuts applied: 1540
Implied bound cuts applied: 20
Flow cuts applied: 922
Mixed integer rounding cuts applied: 67
Zero-half cuts applied: 1169
Gomory fractional cuts applied: 236

Root node processing (before b&c):
Real time             = 127.44
Parallel b&c, 8 threads:
Real time             = 7070.70
Sync time (average)   = 168.52
Wait time (average)   =    0.00
                          -------
Total (root+branch&cut) = 7198.14 sec.
MIP status(107): time limit exceeded

Although the relative gap looks bad, this is in fact a very good solution. The reason for the gap problem is that we are relatively close to zero in our objective (compared to solutions found by heuristics) and the fact that we are crossing zero in the gap calculation does not help.

Saturday, June 18, 2011

Advanced basis with Cplex

Usually Cplex behaves very predictable with LP’s that have an advanced basis to start from. But now I have something that puzzles me. Basically I do a solve from an optimal solution. This should normally run in just a few iterations. But with the default setting I see:

First model

Iteration log . . .
Iteration:     1   Dual objective     =             1.005320
Iteration:   581   Dual objective     =           296.751221
Iteration: 1247   Dual objective     =           511.623938
Iteration: 1681   Dual objective     =           514.072257
LP status(1): optimal

Restart

Iteration log . . .
Iteration:     1    Objective     =           515.104554
Perturbation started.
Iteration:   402    Objective     =           515.104554
Iteration:   806    Objective     =           515.104088
Iteration: 1107    Objective     =           515.103794
Iteration: 1406    Objective     =           515.103557
Iteration: 1684    Objective     =           515.103314
Iteration: 1939    Objective     =           515.103216
Iteration: 2186    Objective     =           515.103124
Iteration: 2434    Objective     =           515.103037
Iteration: 2672    Objective     =           515.102875
Iteration: 2951    Objective     =           515.102790
Iteration: 3210    Objective     =           515.102730
Iteration: 3472    Objective     =           515.102670
Iteration: 3709    Objective     =           515.102630
Iteration: 3943    Objective     =           515.102592
Iteration: 4199    Objective     =           515.102525
Iteration: 4445    Objective     =           515.102493
Iteration: 4692    Objective     =           515.102468
Iteration: 4933    Objective     =           515.102448
Iteration: 5181    Objective     =           515.102428
Iteration: 5431    Objective     =           515.102411
Iteration: 5689    Objective     =           515.102395
Iteration: 5962    Objective     =           515.102379
Iteration: 6224    Objective     =           515.102362
Iteration: 6487    Objective     =           515.102353
Iteration: 6765    Objective     =           515.102342
Iteration: 7041    Objective     =           515.102335
Removing perturbation.
LP status(1): optimal

The restart from an optimal basis takes more iterations than solving from scratch. This is with a default setting of LPMETHOD=0 (automatic). When I use LPMETHOD=1 (Primal Simplex) or LPMETHOD=2 (Dual Simplex) I see:

Restart with LPMETHOD=1

Primal crossover.
Primal: Fixed no variables.
Dual: Fixing 993 variables.
      992 DMoves: Infeasibility 6.76704726e+000 Objective 5.15104554e+002
      394 DMoves: Infeasibility 1.33823843e+000 Objective 5.15104554e+002
        0 DMoves: Infeasibility 7.17019911e-001 Objective 5.15104554e+002
Dual: Pushed 0, exchanged 993.

Iteration log . . .
Iteration: 1 Objective = 515.104554
LP status(1): optimal

Restart with LPMETHOD=2

Dual crossover.
Dual: Fixing 994 variables.
      993 DMoves: Infeasibility 2.07167616e-013 Objective 5.15104554e+002
      414 DMoves: Infeasibility 7.97140132e-014 Objective 5.15104554e+002
        0 DMoves: Infeasibility 0.00000000e+000 Objective 5.15104554e+002
Dual: Pushed 1, exchanged 993.
Primal: Fixed no variables.
LP status(1): optimal

These seem indeed to take fewer iterations to solve the restarted model. I always assumed LPMETHOD=0 would just choose primal or dual simplex, but apparently it also does other things, and in this case that is not helping.

Update

I did some more experiments. With LPMETHOD=0 I can now also achieve a restart if I set all variable levels to zero. I.e.

solve m minimizing z using lp;
solve m minimizing z using lp;

gives a bad restart, while:

solve m minimizing z using lp;
x.l(i,j) = 0;
z.l=0;
solve m minimizing z using lp;

gives me zero iterations for the restart. Something is not completely right. As Bo suggested in his comments, the reason may be related to Cplex trying to do a presolve on an advanced basis. (I have heard about some ancient solvers that tried a basis preserving presolve, may be more limited than what current presolvers do). The difference between LPMETHOD=0 and both LPMETHOD=1 and LPMETHOD=2 is still a mystery to me.

Update 2

The behavior can be partially explained by looking at the Cplex documentation describing the Advind parameter:

For continuous models solved with simplex, setting 1 (one) will use the currently loaded basis. If a basis is available only for the original, unpresolved model, or if CPLEX has a start vector rather than a simplex basis, then the simplex algorithm will proceed on the unpresolved model. With setting 2, CPLEX will first perform presolve on the model and on the basis or start vector, and then proceed with optimization on the presolved problem.

I was using the model from a GAMS environment. There the default is 2. The documentation for GAMS/Cplex is somewhat more limited than the above paragraph:

Advind (Integer)
Use an Advanced Basis. GAMS/Cplex will automatically use an advanced basis from a previous solve
statement. The GAMS Bratio option can be used to specify when not to use an advanced basis. The Cplex
option advind can be used to ignore a basis passed on by GAMS (it overrides Bratio).
(default = determined by GAMS Bratio)
0 Do not use advanced basis
1 Use advanced basis if available
2 Crash an advanced basis if available (use basis with presolve)

Looks like 1 is better for a very good basis, 2 may be better for models with a basis that is not very good (in that case presolving will pay off even when destroying some basis information).

Friday, June 17, 2011

Network formulation

I was looking at a model containing underlying traffic network. Such a network can be quite elegantly formulated in GAMS, but apparently many users still have problems with this. The mathematical formulation is:

The GAMS formulation can look like:

$ontext
max flow network example

Data from example in
     Mitsuo Gen, Runwei Cheng, Lin Lin
     Network Models and Optimization: Multiobjective Genetic Algorithm Approach
     Springer, 2008

Erwin Kalvelagen,
Amsterdam Optimization,
May 2008

$offtext

sets
i 'nodes' /node1*node11/
source(i) /node1/
sink(i) /node11/
;
alias(i,j);

abort$(card(source)<>1) "We want one source node";
abort$(card(sink)<>1) "We want one sink node";

parameter capacity(i,j) /
   node1.node2 60,   node1.node3 60,   node1.node4 60,   node2.node3 30
   node2.node5 40,   node2.node6 30,   node3.node4 30,   node3.node6 50
   node3.node7 30,   node4.node7 40,   node5.node8 60,   node6.node5 20
   node6.node8 30,   node6.node9 40,   node6.node10 30, node7.node6 20
   node7.node10 40, node8.node9 30,   node8.node11 60, node9.node10 30
   node9.node11 50, node10.node11 50 /;

set arcs(i,j);
arcs(i,j)$capacity(i,j) = yes;
display arcs;

parameter rhs(i);
rhs(source) = -1;
rhs(sink) = 1;

variables
x(i,j) 'flow along arcs'
f      'total flow'
;
positive variables x;
x.up(i,j) = capacity(i,j);

equations
   flowbal(i) 'flow balance' ;

flowbal(i)..
   sum(arcs(j,i), x(j,i)) - sum(arcs(i,j), x(i,j)) =e= f*rhs(i);

model m/flowbal/;
solve m maximizing f using lp;

Tuesday, June 14, 2011

MIP Gap not decreasing

Here is a small fragment of a Cplex log:

Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap

   3186 2520     -666.6217 4096      956.6330     -667.2010 1313338 169.74%
   3226 2560     -666.6205 4097      956.6330     -667.2010 1323797 169.74%
   3266 2600     -666.6201 4095      956.6330     -667.2010 1335602 169.74%
Elapsed real time = 2801.61 sec. (tree size = 77.54 MB, solutions = 2)
* 3324+ 2656                         -125.5775     -667.2010 1363079 431.31%
   3334 2668     -666.5811 4052     -125.5775     -667.2010 1370748 431.31%
   3380 2714     -666.5799 4017     -125.5775     -667.2010 1388391 431.31%
   3422 2756     -666.5791 4011     -125.5775     -667.2010 1403440 431.31%

Usually we expect the gap to decrease while the MIP solver is working its way. Here we see a different behavior. Because we go through zero the gap% is actually increasing here.

I believe the gap calculation used by Cplex is:

Obviously this is sensitive w.r.t. to the sign of BestInteger. Of course in practice we don’t see this behavior very often (in many models BestInteger and BestNode have the same sign).

Tuesday, June 7, 2011

Writing solutions back into a database

The issue of writing optimization results back into a database is often a little bit more complex than reading input data (see http://yetanothermathprogrammingconsultant.blogspot.com/2011/04/live-databases.html for some thoughts on reading data).

First with writing more things can go wrong: you need additional permissions, you can overwrite data, and conceptual there are some issues. Here are some questions and possibilities that come to mind:

Do we have a say in how the tables look like or do we need to follow some existing data model?
Update or Insert. Do we use an SQL INSERT or SQL UPDATE to populate solution tables?
Some databases have SQL extensions that implement an UPSERT. E.g. MySQL has INSERT INTO … ON DUPLICATE KEY UPDATE, or REPLACE INTO, Oracle and SQL Server have MERGE.
It may be better to erase the solution tables first.
Or always INSERT with a new key indicating the run id. A new solution gets a new run id.
For very large data some RDBMS systems have a BULK INSERT facility (SQL Server: BCP or BULK INSERT, SQL*Loader for Oracle).
We can write to temporary tables and deal with the problem of merging results into the real tables further down the road.
Just write a well-designed temp database (e.g. in Access) and let the IT people deal with getting the data into their systems.

In general I prefer 8, let others take responsibility for plugging things into the database. Note that this step can be expensive. A few weeks ago I was discussing a complex model where we needed to get the results into the RDBMS. To achieve a reasonable turnaround time, I was given a time limit of about an hour to come up with solutions for the model, while the system guy allocated 2 hours for himself to put the solution back into the database (much time is spent on checks).

Sunday, June 5, 2011

Remote Desktop adventure

Suddenly my remote desktop window went black. Even after reconnecting, I got again just a black screen. Took me a little while to figure out that with Ctrl-Alt-End I could send a Ctrl-Alt-Del to the remote session, so I could log out. That solved the problem: after logging on again everything was fine.