Saturday, November 28, 2009

Exponential Smoothing

Someone mentioned a method to get an optimal value for the parameter α in the exponential smoothing algorithm. Looking at this, the optimization probably very often gives α=1. On pure intuition, I suspect this has something to do with the missing constant term in the corresponding ARIMA(0,1,1) model. In GAMS we can reproduce this as:


GAMS code to reproduce



set t /1931*1960/;

parameter sales(t) /
1931   1806
1932   1644
1933   1814
1934   1770
1935   1518
1936   1103
1937   1266
1938   1473
1939   1423
1940   1767
1941   2161
1942   2336
1943   2602
1944   2518
1945   2637
1946   2177
1947   1920
1948   1910
1949   1984
1950   1787
1951   1689
1952   1866
1953   1896
1954   1684
1955   1633
1956   1657
1957   1569
1958   1390
1959   1387
1960   1289
/;

variables f(t) 'smoothed series', w 'smoothing parameter', sse;
equations deff(t), defsse;

w.lo = 0.0;
w.up = 1.0;
w.l  = 0.8;


deff(t).. f(t) =e= sales(t)$(ord(t)=1) +
                   (w*sales(t-1) + (1-w)*f(t-1))$(ord(t)>1);

defsse.. sse =e= sum(t, sqr(sales(t)-f(t)));

model m/all/;
solve m minimizing sse using nlp;

parameter result(*,*);
result('opt','w')   = w.l;
result('opt','SSE') = sse.l;

set i/i1*i4/;
parameter wx(i) /
  i1 0.8
  i2 0.6
  i3 0.4
  i4 0.2
/;
loop(i,
  w.fx = wx(i);
  solve m minimizing sse using nlp;
  result(i,'w')   = w.l;
  result(i,'SSE') = sse.l;
);
display result;

Indeed this gives the following result:

----     90 PARAMETER result 

              w         SSE

opt       1.000 1222283.000
i1        0.800 1405768.565
i2        0.600 1761367.108
i3        0.400 2421085.784
i4        0.200 3570106.711

Note: the initial value for w (w.l  = 0.8;) is important. If we leave that statement out, CONOPT will not be able to move w away from its default initial value of zero.
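As a sanity check, the SSE calculation can be re-implemented in a few lines of Python (this is just an illustration of the one-step-ahead smoothing recursion used in the GAMS model, not part of the original post):

```python
# One-step-ahead exponential smoothing: f[0] = y[0],
# f[t] = alpha*y[t-1] + (1-alpha)*f[t-1] for t >= 1.
sales = [1806, 1644, 1814, 1770, 1518, 1103, 1266, 1473, 1423, 1767,
         2161, 2336, 2602, 2518, 2637, 2177, 1920, 1910, 1984, 1787,
         1689, 1866, 1896, 1684, 1633, 1657, 1569, 1390, 1387, 1289]

def sse(alpha, y=sales):
    """Sum of squared one-step-ahead forecast errors."""
    f = y[0]
    total = 0.0
    for t in range(1, len(y)):
        f = alpha * y[t - 1] + (1 - alpha) * f
        total += (y[t] - f) ** 2
    return total

# SSE keeps decreasing as alpha moves toward 1 on this data set:
for a in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(a, round(sse(a), 3))
```

At α=1 this gives SSE = 1222283, matching the optimal row of the GAMS result table above.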

Monday, November 23, 2009

Stochastic Forecast with GAMS

A client of mine asked me to see if we could implement some stochastic forecasting scheme in GAMS and incorporate this into their model. A SAS program to calculate stochastic forecasts was provided, and it was easy to translate this into GAMS:

set t /1993*2050/;
set tr(t) 'used in regression' /1993*2009/;
set tf(t) 'used in forecast'   /2010*2050/;

parameter yield(t) /

1993    4.166666667
1994    4.454901961
1995    4.111111111
1996    4.558823529
1997    6.097637795
1998    5.182341651
1999    5.548387097
2000    5.464868701
2001    6
2002    6.326530612
2003    6.52173913
2004    7.374100719
2005    6.475409836
2006    8.035714286
2007    6.395705511
2008    7.003890813
2009    7.339368303
/;


parameter year(t);
year(t) = 1992 + ord(t);

* regression

variables sse, b0 'constant term', b1 'slope';
equations dummy, fit(tr);

dummy.. sse =n= 0;
fit(tr).. yield(tr) =e= b0 + b1*year(tr);

option lp=ls;
model ols /dummy,fit/;
solve ols using lp minimizing sse;

* get standard error of regression
* aka Root MSE in SAS
scalars rmse;
execute_load 'ls.gdx',rmse=sigma;
display rmse;

* forecast

* this if you want every run to be different
* execseed = 1+gmillisec(jnow);

parameters statfcst(t)  'static forecast'
           stochfcst(t) 'stochastic forecast';

statfcst(tr) = yield(tr);
statfcst(tf) = b0.l + b1.l*year(tf);

stochfcst(tr) = yield(tr);
stochfcst(tf) = b0.l + b1.l*year(tf) + rmse*normal(0,1);

display statfcst,stochfcst;

* Monte Carlo Simulation

set r 'replications' /1*10000/;
parameter allyield(r,tf);
allyield(r,tf) = b0.l + b1.l*year(tf) + rmse*normal(0,1);

parameter stochfcst2(t) 'stochastic forecast';

stochfcst2(tr) = yield(tr);
stochfcst2(tf) = sum(r, allyield(r,tf))/card(r);
display stochfcst2;
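The same scheme is easy to sketch outside GAMS. The snippet below (a Python re-implementation for illustration only, not from the original post) fits the trend line by ordinary least squares and then averages many stochastic replications, which reproduces the static forecast up to sampling noise:

```python
import random

# yearly yields 1993..2009 (the regression period), from the data above
yields = [4.166666667, 4.454901961, 4.111111111, 4.558823529, 6.097637795,
          5.182341651, 5.548387097, 5.464868701, 6.0, 6.326530612,
          6.52173913, 7.374100719, 6.475409836, 8.035714286, 6.395705511,
          7.003890813, 7.339368303]
years = list(range(1993, 2010))

# ordinary least squares: yield = b0 + b1*year
n = len(years)
xbar = sum(years) / n
ybar = sum(yields) / n
b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(years, yields))
      / sum((x - xbar) ** 2 for x in years))
b0 = ybar - b1 * xbar

# root MSE with n-2 degrees of freedom (as SAS reports it)
ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, yields))
rmse = (ssr / (n - 2)) ** 0.5

# Monte Carlo: the mean of many stochastic forecasts -> the static forecast
random.seed(123)
year_f = 2020
static = b0 + b1 * year_f
mc = sum(static + rmse * random.gauss(0, 1) for _ in range(10000)) / 10000
print(b1, rmse, static, mc)
```

With 10000 replications the Monte Carlo mean should agree with the static forecast to within a few thousandths (the standard error of the mean is rmse/√10000).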

It is not very difficult to incorporate a SOLVE in the last part. We need a real loop over r for that. The code can look like:

parameter result(r,t) 'used to collect model solutions';
loop(r,
   yield(tf) = b0.l + b1.l*year(tf) + rmse*normal(0,1);
*  do something with yield(t) in model
   solve gams_model using nlp minimizing z;
*  collect solutions and yields
   allyield(r,t) = yield(t);
   result(r,t)   = x.l(t);
);

* get means
parameter myield(t), mresult(t);
myield(t)  = sum(r, allyield(r,t))/card(r);
mresult(t) = sum(r, result(r,t))/card(r);

GAMS has some cool looking fan-charts; maybe I can use these in this project!


Debug unbounded models

To test a high precision linear optimization solver,
the following problem has been presented:

    Maximize:    f(X) = Σm [2+sin(m+Pi/m)]·Xm

    Subject to:  Σm sin[m+Pi/(m+n)]·Xm  <=  b(n),   n = 1,2,..,N

    where        b(n) = M·[2+sin(n+Pi/n)]

    Xm >= 0,   m = 1,2,..,M,   M <= N,   Pi = 3.14...

For M = N = 256, the resulting (extended-precision) maximum
                f(X) = 0.3711087372584210920E+10

Using an AMD Athlon 64 CPU with a clock rate of 2600 MHz,
the calculation time - including extended precision data input -
is 3.0 sec.

Could someone verify this result?

You may need to do more debugging here: the GAMS model is reported as unbounded by any LP solver. A good way to debug is to add a bound on the objective variable so we can look at a solution:


sets i /1*256/, j /1*256/;

parameter m(i); m(i)=ord(i);
parameter n(j); n(j)=ord(j);

parameter b(j);
b(j) = card(i)*(2+sin(n(j)+pi/n(j)));

positive variable x(i);
variable z;
z.up = 1e15;

equations obj,e(j);

obj..  z =e= sum(i, (2+sin(m(i)+pi/m(i)))*x(i));
e(j).. sum(i, sin(m(i)+pi/(n(j)+m(i)))*x(i)) =l= b(j);

model mlp /all/;
solve mlp maximizing z using lp;
display x.l;

This can show:

----     22 VARIABLE x.L 

i25 5.01776E+14

(other solvers may give a different solution). All other elements are zero. Indeed for this x(25) we have an objective coefficient greater than zero: 2 + sin(·) is always positive. The coefficients of x(25) in the constraints are all negative, as can be verified in the equation listing.



So indeed when we make x(25) larger the objective will improve while the constraints do not limit this move.
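A quick numeric check of this claim (a Python sketch, not part of the original post) confirms that the column of x(25) is an unbounded ray: its objective coefficient is positive while all 256 constraint coefficients are negative. If the added bound was z.up = 1e15, this also explains the reported level 1e15/1.993 ≈ 5.018e14:

```python
import math

m = 25   # the variable the LP solver drives to its bound
N = 256

# objective coefficient of x(25): 2 + sin(m + pi/m)
c = 2 + math.sin(m + math.pi / m)

# constraint coefficients of x(25): sin(m + pi/(n+m)) for n = 1..N
a = [math.sin(m + math.pi / (n + m)) for n in range(1, N + 1)]

# increasing x(25) improves the objective and relaxes every row
print(c > 0, all(coef < 0 for coef in a))  # True True
```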


  • A standard textbook full-tableau LP solver will still be numerically unstable, even when using extended precision.
  • Gurobi has a “quad” option to use extended precision.

Sunday, November 22, 2009

Some GAMS/Gurobi message issues

A client of mine purchased GAMS/Gurobi, and tried his license on a small LP problem and immediately sent me an email with “help my license does not work”. The output showed:

Gurobi demo license.
Gurobi library version 2.0.1

That looks like the license was not recognized. Turns out that for small models (within the demo limits) this message is issued. This is really confusing.

And what about this message: when solving an unbounded model, the screen log mentions:

Solved in 0 iterations and 0.00 seconds
Infeasible or unbounded model
LP status(4): Model was proven to be either infeasible or unbounded.

Come on, make up your mind! For a modeler it makes a lot of difference if the model is infeasible or unbounded and when writing algorithms it is even more important to know if the sub-model was infeasible or unbounded. I would prefer the solver (by default) spends some more time (e.g. by restarting using a primal algorithm) to give a proper diagnostic message.

Even worse, the listing file mentions:

**** SOLVER STATUS     1 Normal Completion        
**** MODEL STATUS      19 Infeasible - No Solution
**** OBJECTIVE VALUE                0.0000

which is wrong: the model is unbounded! Also if the model were infeasible it would be good to report a solution with minimum infeasibility or some other hints: just “no solution” is not very helpful for a modeler.

Note: these issues are mostly not related to Gurobi per se, but rather to the GAMS link implementation.

Tuesday, November 17, 2009

Assignment Problem

An assignment problem is formed and solved with the Hungarian method in R:


Current LP solvers are very fast: when we solve this with Gurobi we see:

Presolve time: 0.34 sec.
Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    0.0000000e+00   1.000000e+03   0.000000e+00      0s
    8329    7.4610203e+02   3.115000e+03   0.000000e+00      5s
   11696    7.5580772e+02   0.000000e+00   0.000000e+00      9s

Solved in 11696 iterations and 9.30 seconds

I.e. about three times as fast. So not every specialized algorithm can beat an excellent general purpose LP solver at all times.

Note: for n=1000 the timings for R (assign2) are:

and for Gurobi:

Solved in 40291 iterations and 141.13 seconds
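For intuition only, here is a toy brute-force solver for tiny assignment problems (a Python sketch with made-up data; nothing to do with the n=1000 benchmark above, where enumerating n! assignments is of course hopeless):

```python
from itertools import permutations

def assign_brute_force(cost):
    """Minimize sum of cost[i][p[i]] over all permutations p.
    Only feasible for very small n: there are n! candidate assignments."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return best, sum(cost[i][best[i]] for i in range(n))

cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
print(assign_brute_force(cost))  # assignment (1, 0, 2) with total cost 5
```

Both the Hungarian method and an LP formulation solve this same problem in polynomial time; the point of the post is only that a good general LP solver can be surprisingly competitive.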

Monday, November 16, 2009

Modeling after collecting data

I am involved in a policy evaluation project where lots of effort has been spent on collecting data. Now that we are actually starting to run optimization models using this data, the solutions give rise to other questions. That happens a lot: the solutions of the first versions of a model do not really answer questions. More often they give rise to new questions, and show that some questions are not really well-posed or even relevant. As a result some of the data suddenly seems less important, while at the same time some new data would be really useful for models that help answer the new questions. The lesson is that it is often a good idea not to delay building the model until you have all the data. Instead, try to build early versions of the model as soon as possible, in conjunction with the data collection effort. This can mean we need to “invent” data. That is a useful effort in itself: it requires forming a detailed understanding of the problem that really helps down the road. Building models can help focus the mind on what is important, what we don’t know yet, and what the relevant questions are. It helps to pose these issues more sharply than is otherwise possible. It is very difficult to think about all these things in an abstract way: the model helps to identify problems succinctly.

Sunday, November 15, 2009

NLP model

> 1) Is there any nonlinear programming optmizer that I can user for the
> following problem?
> Obj function: (max) revenue = price * volume
> Constraints: price and volume pair must be from the following variable data
> Variable data set:
> # price volume
> 1 10 500
> 2 20 450
> 3 30 330
> 4 40 250
> 5 50 190
> Expected result: 10,000 (variable row#4)

You are confused. That is not a nonlinear programming problem. You can just 
run through the data set and pick the largest product.
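Indeed, the whole “optimization” is a single pass over the rows (a trivial Python illustration):

```python
# (price, volume) rows from the posted data set
rows = [(10, 500), (20, 450), (30, 330), (40, 250), (50, 190)]

# revenue = price * volume; just pick the row with the largest product
best = max(rows, key=lambda pv: pv[0] * pv[1])
print(best, best[0] * best[1])  # (40, 250) 10000 -- row #4, as expected
```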

Wednesday, November 11, 2009

Persistency in math programming models

For a model I am working on we need a user settable degree of persistency: solutions should take into account historic solutions and stay “close” to those in some sense. Basically: wildly different solutions are unwanted. An excellent paper on this subject is by the NPS people:

Monday, November 9, 2009

NEOS statistics

At the NEOS web site you can ask for statistics on the usage of the NEOS solvers.

I ran the statistics for a year and a few interesting and unexpected patterns showed up.

The top 5 solvers:

List of Solvers
Solver Name # of Submissions
KNITRO 22498
bpmpd 20185
MINOS 15932
MINTO 14769
SNOPT 12916

is dominated by NLP solvers (KNITRO, MINOS, SNOPT). It may be possible that some LPs are solved with NLP solvers (I see that sometimes happening in client models as well). I had to google what bpmpd was, and I am surprised it ended up so high. It is nice to see good old MINOS ending up prominently in the top 5.

If we look at the top 5 input formats we see:

List of Inputs
Solver Input # of Submissions
AMPL 139782
GAMS 63873
CPLEX 3251
Fortran 3176
MPS 2940

Here we see AMPL as the winner with GAMS having half the submissions. This could well be explained by having most submissions coming from educational institutions, where AMPL is most likely more popular than GAMS as its format is closer to a mathematical formulation. It is good to see that almost all submissions are done in modeling languages: users should stay away from programming languages to build models unless there are significant reasons otherwise. Also note that Fortran seems to be more popular than C (at rank 7).

List of Categories
Solver Category # of Submissions
nco 65532
milp 49524
lp 32231
minco 28732
kestrel 15474

I assume that nco means non-linear constrained optimization. This confirms that many NLP models are solved on NEOS. I would guess minco means MINLP.

I certainly would not have predicted some of these numbers!

Wednesday, November 4, 2009

SOS variables vs binary variables

Reading the response in that thread, I was thinking that I probably would model the constraint

(x-1)*(x-2) == 0

differently than the proposed SOS1 formulation. In general an OR can be modeled with a single binary variable (of course in this special case we can model x=1 OR x=2 as x = 1 + xb, where xb is a binary variable). The SOS1 formulation given in the thread is of course perfectly valid.
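The binary-variable version is easy to check by enumeration (a small Python sketch; the variable names are mine):

```python
# model "x = 1 OR x = 2" with a single binary variable xb: x = 1 + xb
for xb in (0, 1):
    x = 1 + xb
    assert (x - 1) * (x - 2) == 0  # the original constraint holds
    print(xb, x)  # prints 0 1 and 1 2
```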

In practice I almost never use SOS1 sets. The reason for using SOS sets can be two-fold:

  • It makes the formulation easier
  • It improves performance

For SOS1 I often find a formulation with binary variables just as convenient. I don’t have hard data but I suspect that the performance gains resulting from using SOS1 variables are probably not that significant: I would suspect that solvers can use effective cuts on models with binary variable formulations. In addition, the SOS performance really benefits from good reference row weights, which I often don’t have (or in the case of GAMS cannot even specify).

For SOS2 variables (almost always in the context of a piecewise linear formulation) I find the SOS2 formulation helps in simplifying the model. E.g. GAMS does not have specific syntax for piecewise linear functions like AMPL has, but with a SOS2 formulation things are not that complicated. Also in this case it would be interesting to see if there is any computational advantage in using a specific approach: SOS2 or binary variables.
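To illustrate the SOS2/lambda idea behind such piecewise linear formulations: a point on the curve is written as a convex combination of breakpoints where at most two adjacent weights are nonzero. A plain Python evaluator of the same interpolation (illustrative only; no solver involved, and the breakpoint data is made up):

```python
def piecewise_linear(x, xs, ys):
    """Evaluate the piecewise linear curve through breakpoints (xs, ys).
    In the SOS2 formulation, x = sum(lam[k]*xs[k]), y = sum(lam[k]*ys[k]),
    sum(lam) = 1, with at most two adjacent lam[k] nonzero; here we just
    compute those two weights directly."""
    for k in range(len(xs) - 1):
        if xs[k] <= x <= xs[k + 1]:
            lam = (xs[k + 1] - x) / (xs[k + 1] - xs[k])  # weight of breakpoint k
            return lam * ys[k] + (1 - lam) * ys[k + 1]
    raise ValueError("x outside breakpoint range")

xs = [0.0, 1.0, 3.0]
ys = [0.0, 2.0, 1.0]
print(piecewise_linear(2.0, xs, ys))  # halfway between (1,2) and (3,1): 1.5
```

In a MIP the lam[k] become variables in a SOS2 set (or are restricted by binary variables); the evaluation above shows which combinations the solver is allowed to pick.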