Sunday, November 15, 2009

NLP model

> 1) Is there any nonlinear programming optmizer that I can user for the
> following problem?
>
> Obj function: (max) revenue = price * volume
> Constraints: price and volume pair must be from the following variable data
set:
>
> Variable data set:
> # price volume
> 1 10 500
> 2 20 450
> 3 30 330
> 4 40 250
> 5 50 190
>
> Expected result: 10,000 (variable row#4)


You are confused. That is not an nonlinear programming problem. You can just 
run through the data set and pick the largest product.

Wednesday, November 11, 2009

Persistency in math programming models

For a model I am working on we need a user settable degree of persistency: solutions should take into account historic solutions and stay “close” to those in some sense. Basically: wildly different solutions are unwanted. An excellent paper on this subject is by the NPS people: http://faculty.nps.edu/dell/docs/optimization_and_persistence.pdf.

Monday, November 9, 2009

NEOS statistics

At http://neos.mcs.anl.gov/neos/report.html you can ask for statistics of the usage of the NEOS solvers (http://neos.mcs.anl.gov/neos/).

I ran the statistics for a year and a few interesting and unexpected patterns showed up.

The top 5 solvers:

List of Solvers
Solver Name # of Submissions
KNITRO 22498
bpmpd 20185
MINOS 15932
MINTO 14769
SNOPT 12916

is dominated by NLP solvers (knitro, minos, snopt). It may be possible that some LP’s are solved with NLP solvers (I see that sometimes also happening in client models). I had to google what bpmpd was (see http://www.sztaki.hu/~meszaros/bpmpd/), and I am surprised it ended up so high. It is nice to see good old MINOS ending up prominently in the top 5.

If we look at the top 5 input formats we see:

List of Inputs
Solver Input # of Submissions
AMPL 139782
GAMS 63873
CPLEX 3251
Fortran 3176
MPS 2940

Here we see AMPL as the winner with GAMS having half the submissions. This could well be explained by having most submissions coming from educational institutions, where AMPL is most likely more popular than GAMS as its format is closer to a mathematical formulation. It is good to see that almost all submissions are done in modeling languages: users should stay away from programming languages to build models unless there are significant reasons otherwise. Also note that Fortran seems to be more popular than C (at rank 7).

List of Categories
Solver Category # of Submissions
nco 65532
milp 49524
lp 32231
minco 28732
kestrel 15474

I assume that nco means non-linear constrained optimization. This confirms that many NLP models are solved on NEOS. I would guess minco means MINLP.

I certainly would not have predicted some of these numbers!

Wednesday, November 4, 2009

SOS variables vs binary variables

Reading the response in http://code.msdn.microsoft.com/solverfoundation/Thread/View.aspx?ThreadId=2485 I was thinking that I probably would model the constraint

(x-1)*(x-2) == 0

differently than the proposed SOS1 formulation. In general an OR can be modeled with a single binary variable (of course in this special case we can model x=1 OR x=2 as x = 1 + xb, where xb is a binary variable). The SOS1 formulation given in the thread is of course perfectly valid.

In practice I almost never use SOS1 sets. The reason for using SOS sets can be two-fold:

  • It makes the formulation easier
  • It improves performance

For SOS1 I often find a formulation with binary variables just as convenient. I don’t have hard data but I suspect that the performance gains resulting from using SOS1 variables are probably not that significant: I would suspect that solvers can use effective cuts on models with binary variable formulations. In addition, the SOS performance really benefits from good reference row weights, which I often don’t have (or in the case of GAMS cannot even specify).

For SOS2 variables (almost always in the context of a piecewise linear formulation) I find the SOS2 formulation helps in simplifying the formulation. E.g. GAMS does not have specific syntax to specify piecewise linear like AMPL, but with a SOS2 formulation, things are not that complicated. See: http://yetanothermathprogrammingconsultant.blogspot.com/2009/06/gams-piecewise-linear-functions-with.html. Also in this case it would be interesting to see if there is any computational advantage in using a specific approach: SOS2 or binary variables.

Wednesday, October 28, 2009

Excel: too many adjustable cells

> My Excel Solver model gives this error message:

image

Yes, this means Solver has hit the built-in limit of 200 variables. See: http://office.microsoft.com/en-us/excel/HP100738491033.aspx. You can upgrade to a solver from Frontline, or if you want a more structured Excel based approach consider something like Microsoft Solver Foundation. Also Cplex 12 has an Excel interface much like Solver. All the other modeling languages also have Excel interoperability.

Apparently, there are quite a few models with fewer than 200 variables, see http://www.utexas.edu/courses/lasdon/design3.htm.

Note that for some models the 200 variable limit is less restrictive than it seems as intermediate variables do not have to be included in the adjustable cells.

For larger models I am a big proponent of moving to a modeling language. I have converted a number of models written in Excel to GAMS and AMPL and actually in virtually all of them I had problems in reproducing the results: all these Excel spreadsheets contained bugs. It is just too difficult to do error-free comprehensive modeling using the Excel formula paradigm. I’d suggest to use Excel where it is really good: as data editor and for reporting. For the stuff in between: the actual modeling, nothing beats a specialized modeling language.

NLP infeasible

A client of mine sent me a large NLP (36k variables and equations) that was infeasible. When a general local NLP solver returns INFEASIBLE, this may mean two things:

  1. The model is infeasible, i.e. no feasible solution exists
  2. or the model contains some feasible solutions but the solver cannot find any

As is the case for many NLP models this model had many linear equations. So I first solved for the linear equations only using an LP solver. This was still infeasible. Now we know we have case 1. After fixing this problem (some bounds were too tight) I kept the LP in, as with little effort I could construct a feasible solution for the whole problem using the LP solution. The NLP solver I used was CONOPT which is a feasible path algorithm, so it will stay feasible when the initial point is feasible. This trick may speed up the solution procedure but also provides for better results in case the solver stops prematurely: it will give us at least a solution that is feasible.

Tuesday, October 27, 2009

Tables in Excel 2007

I started to convert some spreadsheets to Excel 2007. This will allow me to use Tables. See e.g. http://www.jkp-ads.com/Articles/Excel2007Tables.asp, http://blogs.msdn.com/excel/archive/2005/10/28/486604.aspx. Need to remember this word: “structured referencing”.

Saturday, October 24, 2009

Is there a way to convert this linear functional optimization problem into linear programming?

max a'*x / b'*x

w.r.t x,

s.t. x>=0, 1'*x=1

Here a and b are given vectors, " ' " denotes the vector
transposition, x is our search variable which is a vector, 1 is also a
vector above. It means the element of x should sum up to 1.

Any thoughts?

Thanks a lot!

Sometimes. In slightly different notation, your model looks like:

imageWe need to think about the denominator becoming zero. If we allow any b(i) this can easily happen. One possible way to make the problem well defined is to consider b(i)>0. (Alternatively all b(i)<0 will do). So let’s assume all b(i)>0.

First of all, this problem is not that difficult to solve for a general NLP solver. Good old MINOS is very quick on this even for large instances (e.g. 10k variables). The Hessian becomes big then, so solvers requiring second derivatives are in a disadvantage here. Note: there is a reformulation possible that reduces the size of the Hessian: write the objective as constraint of the form varobj*sum(i, b(i)*x(i))=sum(i,a(i)*x(i)).

However, there is a reformulation possible into an LP.  When we apply the transformation:

imagewe end up with the LP model:

image After solving we can recover the x variables through:

image For more info about this transformation see section 4.3.2 in http://www.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf or section 2.10.2 in http://www.dashoptimization.com/home/downloads/book/booka4.pdf. I simplified things a little bit by using that x(i) sums to 1. The disadvantage of this transformation for more complex models is that we really need to change the complete model.

Alternatively we may exploit the structure of the model and just use the simple algorithm:

imageSo we don’t even need linear programming here, just find the maximum of n numbers. The above is illustrated in the following GAMS model:

 

*------------------------------------------------------
* random data, b(i)>0
*------------------------------------------------------

set i /i1*i10000/;
parameter a(i), b(i);
a(i) = uniform(-10,10);
b(i) = uniform(1,10);

*------------------------------------------------------
* nonlinear model
*------------------------------------------------------

positive variable x(i) 'nlp solution';
variable obj1;
equations e_obj1, sum1;

e_obj1.. obj1 =e=
sum(i, a(i)*x(i))/sum(i, b(i)*x(i));
sum1..
sum(i, x(i)) =e= 1;

* initial point
x.l(i) = 1/
card(i);

model m1 /e_obj1,sum1/;
solve m1 maximizing obj1 using nlp;

*------------------------------------------------------
* transformed linear model
*------------------------------------------------------

positive variable y(i);
variable obj2;
equations e_obj2, sum2;

e_obj2.. obj2 =e=
sum(i, a(i)*y(i));
sum2..
sum(i, b(i)*y(i)) =e= 1;

model m2 /e_obj2,sum2/;
solve m2 maximizing obj2 using lp;

*------------------------------------------------------
* recover x
*------------------------------------------------------

parameter pz,px(i) 'recovered solution';
pz =
sum(i,y.l(i));
px(i) = y.l(i)/pz;

display x.l,px;

*------------------------------------------------------
* alternative algorithm
*------------------------------------------------------

parameter c(i); c(i) = a(i)/b(i);
scalar max; max = smax(i, c(i));
display max;
set jmax(i); jmax(i)$(c(i)=max) = yes;
abort$(card(jmax)<>1) "want a single max a(i)/b(i)";
parameter px2(i) 'alternative solution';
px2(jmax) = 1;
display px2;

Wednesday, October 21, 2009

MIQP

I am following your blog with great interest and I would be very happy to have your insight on an optimisation problem I am currently struggling with.
I am wondering if there is an easy way to solve a model with linear constraints on a vector of both continuous and binary variables and a quadratic objective function depending only on the subset of continous variables (e.g. least squares distance to an objective vector) without having to resort to a non-linear solver.
I reckon this is some kind of 'MIP + Constrained' Linear Least Squares. I would be disappointed if there was no better way to solve this than treating it as 'any' non-linear problem, but as I found nothing relevant at the moment, I'd be glad to hear what you think about it.

Some solvers (such as Cplex) offer a MIQP (Mixed Integer Quadratic Programming) functionality. That may be something to try out. Otherwise, if the objective tries to penalize some large deviations, it may be possible to achieve a similar result by minimizing the maximum deviation. This can be formulated in a linear fashion.

Yet another linear approximation can be formed by minimizing the sum of absolute deviations. For an example see http://yetanothermathprogrammingconsultant.blogspot.com/2009/07/ms-solver-foundation-excel-interface_07.html. These linear formulations essentially choose a different norm ||x||.

Informs powerpoint

 

Powerpoint from INFORMS presentation Data and Software Interoperability with GAMS at San Diego INFORMS meeting.

Basic ideas:

  1. Specialized Modeling Languages such as AMPL and GAMS are in many cases more efficient than using traditional programming languages, even when fortified with a modeling API (Ilog Concert or MS SFS). It is unfortunate that those environments are getting as much attention as they do.
  2. GAMS/GDX is a tool that really works. Some large scale examples are shown.
  3. Problems in the interface between systems (e.g between modeling language and database) are to be expected. These are the seams of an application, and here is where problems are more likely to occur. This is unfortunate because fixing these problems is more difficult: you need knowledge about both systems e.g. SQL and GAMS to debug. This is an extension of Murphy’s law: bugs appear where they are most difficult to debug and fix.
  4. SQL2GMS/MDB2GMS is less elegant than direct access through specialized syntax as in AMPL. However the big advantage is that we can repair problems on-the-fly using SQL. SQL is typically good in fixing problems where GAMS is less well-suited (e.g. string processing). In practice almost all data import requires some fixing.
  5. GAMS is one of the modeling languages that is very well equipped to do data manipulation. For some policy evaluation models this is very important. Some numerical evidence is given about such models: large models with > 95% of the code dedicated to data manipulation. GAMS excels when dealing with sparse data.
  6. New facilities in GAMS allow to call external programs and exchange data through GDX. This works but can be cumbersome. A proposal is presented to allow for direct function calls to external functions.