Yet Another Math Programming Consultant: May 2011

Monday, May 30, 2011

GAMS/Gurobi and Ctrl-C

I am seeing some strange results when I interrupt a GAMS/Gurobi run with Ctrl-c (in the IDE using the Interrupt button). The GAMS link should give in that case the best integer solution found so far, basically identical to an iteration or time limit. However the solution seems wrong: many of the binary variables x(i,j) in a large model are off (they are reported as zero instead of one). I suspect this is a problem in the GAMS link (i.e. not in Gurobi itself). From what I can see the link it passing back levels of integer variables incorrectly in these circumstances (I have seen this before: integer variables should have basis status SUPERBASIC to prevent only a basis status being sent back).

Btw, for this (huge) model the Gurobi heuristics do a fantastic job finding good solutions:

     0     0 179791.544    0 6744 1511067.07 179791.544 88.1%     -   91s
H    0     0                    247836.12285 179791.544 27.5%     -   93s
H    0     0                    232489.21120 179791.544 22.7%     -   96s
     0     0 181272.276    0 7366 232489.211 181272.276 22.0%     - 397s
H    0     0                    226736.43968 181272.276 20.1%     - 399s
H    0     0                    218612.71830 181272.276 17.1%     - 404s

Just need to remember not to use the Interrupt button.

PS. Another minor issue with the link:

MIP Solution:        7259.007800    (-2147483648 iterations, 0 nodes)
Best possible:        6916.876514
Absolute gap:          342.131286
Relative gap:            0.047132

it has troubles with the iteration count (which actually is 120570, a number that should not overflow).

Thursday, May 19, 2011

Optimal Spread (2)

Is turns out we also have to distribute x(t)’s such that we follow a non-uniform parameter. In the previous section http://yetanothermathprogrammingconsultant.blogspot.com/2011/05/optimal-spread.html we used a function v(t) = 1 with cumulative function w(t) = t.

In this case we consider a generic parameter v(t), with its cumulative cousin w(t)=w(t-1)+v(t). To trace this function as closely as possible we can use:

E.g. when we use v(t)=t, then we see:

---- 77 PARAMETER result

v x

i1        1.000
i2        2.000
i3        3.000
i4        4.000
i5        5.000       1.000
i6        6.000
i7        7.000
i8        8.000
i9        9.000       1.000
i10      10.000
i11      11.000
i12      12.000       1.000
i13      13.000
i14      14.000       1.000
i15      15.000
i16      16.000       1.000
i17      17.000
i18      18.000       1.000
i19      19.000
i20      20.000       1.000

Saturday, May 14, 2011

Optimal spread

In a multi-objective MIP model I am working I have a binary variable x(t). This variable is mostly zero but now and then it is one. One of the goals is to achieve an even spread of the ones. I.e.

The first row has the ones clustered in the middle. The second one looks pretty good.

The following idea is based on how Q-Q plots are used to find distributions in statistical data (http://en.wikipedia.org/wiki/Q-Q_plot).

First we form a running sum:

The “predicted’ running sum for an evenly distributed x(t) can be expressed as tn/T where n is the total number of ones, t is the current index and T is the last t:

Now we try to keep the values s(t) as close as possible to the predicted values tn/T. I.e.:

Note that n is a variable in my model: I don’t know in advance how many x(t)’s are turned on. Indeed when I run this (with n=3) I see:

---- 42 VARIABLE x.L

t3 1.000, t7 1.000, t10 1.000

Here are the results when we compare the two solutions (click to enlarge):

Indeed this seems to confirm this approach could work, and we can add d(max)-d(min) as an extra objective to minimize.

Wednesday, May 4, 2011

Excel: Trace Dependents

I often use the “Trace Dependents” tool in Excel to quickly see which formulas use a given cell. Here is an example where I was a bit overwhelmed:

Tuesday, May 3, 2011

Large (almost) triangular spreadsheet?

A spreadsheet model without circular references can be called “triangular”. Excel can form a dependency graph and ordering such that recalculation can be done in one iteration. If there are circular references the spreadsheet needs to perform more iterations before the results converge.

I was given a very large spreadsheet implementing an agricultural country trade model. The question was how near triangular is this? With the Excel tool to find circular references, we only find a single case:

Luckily I have a tool that takes an Excel spreadsheet and parses Excel formulas and produces a GAMS representation of this. Essentially each cell forms an equation:

The whole spreadsheet can be viewed as a fixed point expression:

This is of course just a system of nonlinear equations:

When we solve this spreadsheet model as an NLP with Conopt we get some statistics:

--- 67,618 rows 68,707 columns 232,364 non-zeroes--- 776,017 nl-code 103,528 nl-non-zeroes

Iter Phase Ninf Infeasibility RGmax NSB Step InItr MX OK 0 0 7.3221660778E-07 (Input point) Pre-triangular equations: 22507 Post-triangular equations: 13897 1 0 1.2425686171E-07 (After pre-processing) 2 0 3.4325009848E-11 (After scaling)

Graphically, after reordering rows and columns, we have:

This certainly gives an indication we are not close to a triangular model and we need a simultaneous equation solver to handle this type of model.

Initial Point for NLP models

This example was introduced to illustrate the need for good initial points in NLP models (see http://amsterdamoptimization.com/pdf/snopt.pdf).

The problem is just a three variable model: find the smallest enclosing circle around n points. GAMS has a default initial point of zero which is often very bad. Only IPOPT can find an optimal solution quickly. But if we add a good starting point, all solvers find the optimal very quickly. Even the performance of IPOPT improves.

For a presentation I cleaned up the example a little bit:

Note: actually the model is easier to solve using a slightly different formulation:

(i.e. minimize the square of the radius r).