Thursday, June 30, 2011

Cplex polishing

For some difficult large models, the Polishing facility in Cplex can sometimes do extremely well. This is an example:

Elapsed real time = 874.90 sec. (tree size = 11.31 MB, solutions = 2)
    233   235     -158.9737  3551      621.0895     -159.3086   258914  125.65%
    234   236     -158.9448  3578      621.0895     -159.3086   258964  125.65%
    266   268     -158.9498  3578      621.0895     -159.3086   265158  125.65%
    306   308     -158.9338  3568      621.0895     -159.3086   271534  125.65%
    310   312     -158.9315  3511      621.0895     -159.3086   272354  125.65%
    312   314     -158.9489  3581      621.0895     -159.3086   272367  125.65%
    322   324     -158.9310  3574      621.0895     -159.3086   273646  125.65%
    346   348     -158.8857  3561      621.0895     -159.3086   277217  125.65%
    354   356     -158.8368  3532      621.0895     -159.3086   278697  125.65%
    370   372     -158.8741  3567      621.0895     -159.3086   281105  125.65%
Elapsed real time = 995.15 sec. (tree size = 49.71 MB, solutions = 2)

Starting condition for polishing reached (PolishAfterTime).
Starting solution polishing.

*   390+  390                          360.3876     -159.3086   285698  144.20%
*   390+  390                          231.7287     -159.3086   285698  168.75%
*   390+  390                          100.2251     -159.3086   285698  258.95%
    390   392     -158.8957  3503      100.2251     -159.3086   285698  258.95%
    391   393     -158.9880  3601      100.2251     -159.3086   286729  258.95%
    392   394     -158.9935  3613      100.2251     -159.3086   286790  258.95%
    393   395     -158.9735  3574      100.2251     -159.3086   287216  258.95%
    394   396     -158.9763  3595      100.2251     -159.3086   287247  258.95%
    395   397     -158.9980  3601      100.2251     -159.3086   287328  258.95%
    396   398     -158.9806  3600      100.2251     -159.3086   287599  258.95%
    397   399     -159.0641  3615      100.2251     -159.3086   296751  258.95%

GUB cover cuts applied:  854
Clique cuts applied:  14
Cover cuts applied:  1540
Implied bound cuts applied:  20
Flow cuts applied:  922
Mixed integer rounding cuts applied:  67
Zero-half cuts applied:  1169
Gomory fractional cuts applied:  236

Root node processing (before b&c):
  Real time             =  127.44
Parallel b&c, 8 threads:
  Real time             = 7070.70
  Sync time (average)   =  168.52
  Wait time (average)   =    0.00
                          -------
Total (root+branch&cut) = 7198.14 sec.
MIP status(107): time limit exceeded

Although the relative gap looks bad, this is in fact a very good solution. The reason for the gap problem is that we are relatively close to zero in our objective (compared to solutions found by heuristics) and the fact that we are crossing zero in the gap calculation does not help.

Saturday, June 18, 2011

Advanced basis with Cplex

Usually Cplex behaves very predictable with LP’s that have an advanced basis to start from. But now I have something that puzzles me. Basically I do a solve from an optimal solution. This should normally run in just a few iterations. But with the default setting I see:

First model

Iteration log . . .
Iteration:     1   Dual objective     =             1.005320
Iteration:   581   Dual objective     =           296.751221
Iteration:  1247   Dual objective     =           511.623938
Iteration:  1681   Dual objective     =           514.072257
LP status(1): optimal

Restart

Iteration log . . .
Iteration:     1    Objective     =           515.104554
Perturbation started.
Iteration:   402    Objective     =           515.104554
Iteration:   806    Objective     =           515.104088
Iteration:  1107    Objective     =           515.103794
Iteration:  1406    Objective     =           515.103557
Iteration:  1684    Objective     =           515.103314
Iteration:  1939    Objective     =           515.103216
Iteration:  2186    Objective     =           515.103124
Iteration:  2434    Objective     =           515.103037
Iteration:  2672    Objective     =           515.102875
Iteration:  2951    Objective     =           515.102790
Iteration:  3210    Objective     =           515.102730
Iteration:  3472    Objective     =           515.102670
Iteration:  3709    Objective     =           515.102630
Iteration:  3943    Objective     =           515.102592
Iteration:  4199    Objective     =           515.102525
Iteration:  4445    Objective     =           515.102493
Iteration:  4692    Objective     =           515.102468
Iteration:  4933    Objective     =           515.102448
Iteration:  5181    Objective     =           515.102428
Iteration:  5431    Objective     =           515.102411
Iteration:  5689    Objective     =           515.102395
Iteration:  5962    Objective     =           515.102379
Iteration:  6224    Objective     =           515.102362
Iteration:  6487    Objective     =           515.102353
Iteration:  6765    Objective     =           515.102342
Iteration:  7041    Objective     =           515.102335
Removing perturbation.
LP status(1): optimal

The restart from an optimal basis takes more iterations than solving from scratch. This is with a default setting of LPMETHOD=0 (automatic). When I use LPMETHOD=1 (Primal Simplex) or LPMETHOD=2 (Dual Simplex) I see:

Restart with LPMETHOD=1

Primal crossover.
  Primal:  Fixed no variables.
  Dual:  Fixing 993 variables.
      992 DMoves:  Infeasibility 6.76704726e+000  Objective 5.15104554e+002
      394 DMoves:  Infeasibility 1.33823843e+000  Objective 5.15104554e+002
        0 DMoves:  Infeasibility 7.17019911e-001  Objective 5.15104554e+002
  Dual:  Pushed 0, exchanged 993.

Iteration log . . .
Iteration:     1    Objective     =           515.104554
LP status(1): optimal

Restart with LPMETHOD=2

Dual crossover.
  Dual:  Fixing 994 variables.
      993 DMoves:  Infeasibility 2.07167616e-013  Objective 5.15104554e+002
      414 DMoves:  Infeasibility 7.97140132e-014  Objective 5.15104554e+002
        0 DMoves:  Infeasibility 0.00000000e+000  Objective 5.15104554e+002
  Dual:  Pushed 1, exchanged 993.
  Primal:  Fixed no variables.
LP status(1): optimal

These seem indeed to take fewer iterations to solve the restarted model. I always assumed LPMETHOD=0 would just choose primal or dual simplex, but apparently it also does other things, and in this case that is not helping.

Update

I did some more experiments. With LPMETHOD=0 I can now also achieve a restart if I set all variable levels to zero. I.e.

solve m minimizing z using lp;
solve
m minimizing z using lp;

gives a bad restart, while:

solve m minimizing z using lp;
x.l(i,j) = 0;
z.l=0;

solve
m minimizing z using lp;

gives me zero iterations for the restart. Something is not completely right. As Bo suggested in his comments, the reason may be related to Cplex trying to do a presolve on an advanced basis. (I have heard about some ancient solvers that tried a basis preserving presolve, may be more limited than what current presolvers do). The difference between LPMETHOD=0 and both LPMETHOD=1 and LPMETHOD=2 is still a mystery to me.

Update 2

The behavior can be partially explained by looking at the Cplex documentation describing the Advind parameter:

For continuous models solved with simplex, setting 1 (one) will use the currently loaded basis. If a basis is available only for the original, unpresolved model, or if CPLEX has a start vector rather than a simplex basis, then the simplex algorithm will proceed on the unpresolved model. With setting 2, CPLEX will first perform presolve on the model and on the basis or start vector, and then proceed with optimization on the presolved problem.

I was using the model from a GAMS environment. There the default is 2. The documentation for GAMS/Cplex is somewhat more limited than the above paragraph:

Advind (Integer)
Use an Advanced Basis. GAMS/Cplex will automatically use an advanced basis from a previous solve
statement. The GAMS Bratio option can be used to specify when not to use an advanced basis. The Cplex
option advind can be used to ignore a basis passed on by GAMS (it overrides Bratio).
(default = determined by GAMS Bratio)
0 Do not use advanced basis
1 Use advanced basis if available
2 Crash an advanced basis if available (use basis with presolve)

Looks like 1 is better for a very good basis, 2 may be better for models with a basis that is not very good (in that case presolving will pay off even when destroying some basis information).

Friday, June 17, 2011

Network formulation

I was looking at a model containing underlying traffic network. Such a network can be quite elegantly formulated in GAMS, but apparently many users still have problems with this. The mathematical formulation is:

image

The GAMS formulation can look like:

$ontext
 
max flow network example

 
Data from example in
    
Mitsuo Gen, Runwei Cheng, Lin Lin
    
Network Models and Optimization: Multiobjective Genetic Algorithm Approach
    
Springer, 2008

 
Erwin Kalvelagen,
 
Amsterdam Optimization,
 
May 2008

$offtext

sets
  i
'nodes' /node1*node11/
  source(i)
/node1/
  sink(i)
/node11/
;
alias(i,j);

abort$(card(source)<>1) "We want one source node"
;
abort$(card(sink)<>1) "We want one sink node"
;

parameter capacity(i,j) /

  
node1.node2 60,   node1.node3 60,   node1.node4 60,   node2.node3 30
  
node2.node5 40,   node2.node6 30,   node3.node4 30,   node3.node6 50
  
node3.node7 30,   node4.node7 40,   node5.node8 60,   node6.node5 20
  
node6.node8 30,   node6.node9 40,   node6.node10 30,  node7.node6 20
  
node7.node10 40,  node8.node9 30,   node8.node11 60,  node9.node10 30
  
node9.node11 50,  node10.node11 50 /;

set
arcs(i,j);
arcs(i,j)$capacity(i,j) =
yes
;
display
arcs;

parameter
rhs(i);
rhs(source) = -1;
rhs(sink) = 1;


variables

  x(i,j)
'flow along arcs'
  f     
'total flow'
;
positive variables x;
x.up(i,j) = capacity(i,j);


equations

   flowbal(i)
'flow balance' ;

flowbal(i)..
  
sum(arcs(j,i), x(j,i)) - sum
(arcs(i,j), x(i,j)) =e= f*rhs(i);

model m/flowbal/
;
solve m maximizing f using lp;

Tuesday, June 14, 2011

MIP Gap not decreasing

Here is a small fragment of a Cplex log:

     Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap


   3186  2520     -666.6217  4096      956.6330     -667.2010  1313338  169.74%
   3226  2560     -666.6205  4097      956.6330     -667.2010  1323797  169.74%
   3266  2600     -666.6201  4095      956.6330     -667.2010  1335602  169.74%
Elapsed real time = 2801.61 sec. (tree size = 77.54 MB, solutions = 2)
*  3324+ 2656                         -125.5775     -667.2010  1363079  431.31%
   3334  2668     -666.5811  4052     -125.5775     -667.2010  1370748  431.31%
   3380  2714     -666.5799  4017     -125.5775     -667.2010  1388391  431.31%
   3422  2756     -666.5791  4011     -125.5775     -667.2010  1403440  431.31%

Usually we expect the gap to decrease while the MIP solver is working its way. Here we see a different behavior. Because we go through zero the gap% is actually increasing here.

I believe the gap calculation used by Cplex is:

image

Obviously this is sensitive w.r.t. to the sign of BestInteger. Of course in practice we don’t see this behavior very often (in many models BestInteger and BestNode have the same sign).

Tuesday, June 7, 2011

Writing solutions back into a database

The issue of writing optimization results back into a database is often a little bit more complex than reading input data (see http://yetanothermathprogrammingconsultant.blogspot.com/2011/04/live-databases.html for some thoughts on reading data).

First with writing more things can go wrong: you need additional permissions, you can overwrite data, and conceptual there are some issues. Here are some questions and possibilities that come to mind:

  1. Do we have a say in how the tables look like or do we need to follow some existing data model?
  2. Update or Insert. Do we use an SQL INSERT or SQL UPDATE to populate solution tables?
  3. Some databases have SQL extensions that implement an UPSERT. E.g. MySQL has INSERT INTO … ON DUPLICATE KEY UPDATE, or REPLACE INTO, Oracle and SQL Server have MERGE.
  4. It may be better to erase the solution tables first.
  5. Or always INSERT with a new key indicating the run id. A new solution gets a new run id.
  6. For very large data some RDBMS systems have a BULK INSERT facility (SQL Server: BCP or BULK INSERT, SQL*Loader for Oracle).
  7. We can write to temporary tables and deal with the problem of merging results into the real tables further down the road.
  8. Just write a well-designed temp database (e.g. in Access) and let the IT people deal with getting the data into their systems.

In general I prefer 8, let others take responsibility for plugging things into the database. Note that this step can be expensive. A few weeks ago I was discussing a complex model where we needed to get the results into the RDBMS. To achieve a reasonable turnaround time, I was given a time limit of about an hour to come up with solutions for the model, while the system guy allocated 2 hours for himself to put the solution back into the database (much time is spent on checks).

Sunday, June 5, 2011

Remote Desktop adventure

Suddenly my remote desktop window went black. Even after reconnecting, I got again just a black screen. Took me a little while to figure out that with Ctrl-Alt-End I could send a Ctrl-Alt-Del to the remote session, so I could log out. That solved the problem: after logging on again everything was fine.