Yet Another Math Programming Consultant: December 2011

Sunday, December 25, 2011

Savings from applying MIP models

In the first chapter of “Applied Integer Programming: Modeling and Solution”, by Der-San Chen; Robert G. Batson; Yu Dang (2010), there is a table with successful mip applications taken from articles from Interfaces. See http://media.wiley.com/product_data/excerpt/67/04703730/0470373067.pdf.

Some numbers seem low: $0.25 million savings (EK:annual?) from a least-cost crew scheduling model at American Airlines in 1981, while others seem rather high: a coal transportation study by the World Bank and Chinese State Planning Commission (1995): $6.4 billion in savings (over what period is not specified).

Thursday, December 22, 2011

Bad NLP modeling

In http://groups.google.com/group/knitro/browse_thread/thread/a1785c5c6f64b62e/ we see the following constructs being used in an NLP model:

c6.. y1*(y1-1) =e= 0;
c7.. y2*(y2-1) =e= 0;
c8.. y3*(y3-1) =e= 0;

This is not really a good idea. These constraints essentially say: y1,y2 and y3 can assume the values 0 or 1, i.e. they are binary variables.

If you really want to use binary variables in an NLP, form a proper MINLP model. If this idea of using non-convex constraints to model binary variables was really so hot, then we would no longer need a MIP solver and just use an NLP solver.

It looks a bit that this poster is attempting to run before (s)he can walk. It would be much better if you have some knowledge about NLP modeling before before actually working on “stochastic global optimization”. May be the advisor or teacher should play a more constructive role here.

Notes:

in general NLP solvers don’t “understand” the functions you give them. They merely sample points, function values and derivatives as they go along. One exception is the global solver Baron: this solver really tries to work with and understand the functional form of the NL functions.
although in general it is better to model binary decisions as binary variables in an MINLP (or MIP if the rest is linear), there are some attempts to use nonlinear programming techniques to solve integer models:

W. Murray and K. M. Ng, “An Algorithm for Nonlinear Optimization Problems with Binary Variables,” Computational Optimization and Applications, Vol. 47, No. 2, 2008, pp. 257-288
Roohollah Aliakbari Shandiz, Nezam Mahdavi-Amiri, “An Exact Penalty Approach for Mixed Integer Nonlinear Programming Problems”, American Journal of Operations Research, 2011, 1, 185-189

Monday, December 19, 2011

Sequence dependent setup times: difficult MIPs

A simple single machine scheduling model with sequence dependent setup times is a good example where MIP solvers are not very good yet.

The simplest formulation for the 15 job problem we show below, is probably where we use the following definition:

x(i,j) = 1 if job i is scheduled before job j
0 otherwise

Obviously we don’t need the diagonal: x(i,i). Also if we know that x(i,j) = 1-x(j,i), so we don’t need both the lower- and upper-triangular part. A model now can look like:

$ontext

    Single Machine with Sequence Dependent Setup times.

    Data from: https://www.msu.edu/~rubin/files/research.html

$offtext

sets
   i0    'all jobs' /job0*job15/
   i(i0) 'jobs to schedule' /job1*job15/
;

alias(i,j), (i0,j0);

table data(*,*)
         job1 job2 job3 job4 job5 job6 job7 job8 job9 job10 job11 job12 job13 job14 job15
proctime   88   95   98 102 100   97 100   96 102    97   108   107   102   100    96
duedate 1162 1194 1195 1196 1229 1236 1240 1301 1323 1353 1368 1402 1421 1432 1433
    job0   12    7   15   19    2   14    7   19    9     3    20     0    17    13     4
    job1    0   16   19   10    2   13   17   20    3     9     6     3    19     9    13
    job2    1    0   10   19    3    7    9   18   20    14    10    11    17     5     7
    job3    8   17    0    7   19    6   10    3    3     6    14     4    12    20     8
    job4   18   13   10    0   18    1   14    4   16     3     6    13     2     2    15
    job5    2   14   14    3    0   10   12    6   20    12     1     7    16     6     3
    job6    4    2   10    1   16    0   20   13   17    10    15    19    12    18     6
    job7    5    5   19   18    5    2    0    8    5    15    17     0     0    11     2
    job8   13   18    4   15   19   13    3    0   18    14    12    16    15    16     5
    job9   19   17    9   11   17   20   20    3    0     2    10     4     0     8    13
   job10   13    6   20   19   15   13   12    7    5     0    11     5     5     8     7
   job11 11    9    3   10   11    0    9   19   18     4     0    14     3    19     3
   job12   14   17   18   10    0   13    0   20   19     8     4     0     1    18    15
   job13    7   20    1   10   12    0    7   13    5    13     5    11     0    20    4
   job14    7   14   17   13   19   10   11   10   15     9     5     2    10     0     5
   job15    0    5    6   18   12    8   20    0    1    17     7    18    12    18     0
;

parameters
proctime(i)
setuptime(i0,j)
duedate(i)
;
proctime(i) = data('proctime',i);
setuptime(i0,j) = data(i0,j);
duedate(i) = data('duedate',i);

scalar M;
~~M =~~ ~~sum((i0,j),setuptime(i0,j))+sum(i,proctime(i));~~
M = sum(j, smax(i0,setuptime(i0,j)) + proctime(j));

variables
   x(i,j)        'j later than i'
   starttime(i) 'start of job'
   endtime(i)    'end of job'
   makespan      'total makespan'
   z             'objective variable'
   late(i)       '~~lateness~~ tardiness of job'
;
binary variables x;
positive variables late,starttime,endtime;

equations
   calcstart1    'bound on starttime'
   calcstart2    'bound on starttime'
   calcstart3    'bound on starttime'
   ontime        'compare endtime with duedate'
   emakespan     'bound on makespan'
   calcendtime   'end=start+proc'
   obj           'objective'
;

set lt(i,j);
lt(i,j)$(ord(i)<ord(j))=yes;

calcstart1(lt(i,j)).. starttime(j) =g= endtime(i) + setuptime(i,j) - M*(1-x(i,j));
calcstart2(lt(j,i)).. starttime(j) =g= endtime(i) + setuptime(i,j) - M*x(j,i);
calcstart3(i).. starttime(i) =g= setuptime('job0',i);

calcendtime(i).. endtime(i) =e= starttime(i) + proctime(i);

ontime(i).. endtime(i) =l= duedate(i) + late(i);

emakespan(i).. makespan =g= endtime(i);

obj.. z =e= 0*makespan + 1*sum(i,late(i));

* initial solution: use mipstart to load it
x.l(lt(i,j)) = 1;

option optcr=0, reslim=3600;
model m1 /all/;
solve m1 minimizing z using mip;

display x.l,starttime.l,endtime.l;

We can find the optimal solution of 90 in about an hour or so (running on one thread):

However, as we can see we are far from proving this is indeed the optimal solution: the best bound has not moved at all.

Notes:

Constraint calcstart3 can be written as a bound. Of course these days, a MIP solver will do that for you in the pre-solve phase.
The initial solution was constructed by ordering the jobs by due date.
The Gantt chart for the machine can look like:

Here the late jobs are colored red.
The model minimizes sum of ~~lateness~~ tardiness. Additional objectives could be number of late jobs and total makespan.
More difficult to model is this: scheduled downtime. If jobs can not be interrupted, this causes holes (idle time) in the schedule.
A different, slightly more complicated formulation would result from:
x(i,j) = 1 if job i is scheduled immediately before job j
0 otherwise
IBM Ilog’s CP solver using OPL can model sequence dependent setup times directly (see: http://publib.boulder.ibm.com/infocenter/cosinfoc/v12r3/topic/ilog.odms.cpo.help/Content/Optimization/Documentation/Optimization_Studio/_pubskel/usrcpoptimizer2793.html).
Many papers suggest a meta-heuristic (genetic algorithm etc.) to solve this.
Of course a MIP model can still help to define the problem correctly and to solve small instances to optimality.
Paper for data file is: “SCHEDULING IN A SEQUENCE DEPENDENT SETUP ENVIRONMENT WITH GENETIC SEARCH”, Paul A. Rubin, Gary L. Ragatz, Computers & Operations Research, 22, no 1, jan 1995, pp 85-99.
The paper notes that according to a survey, 70% of scheduling problems may be subject to sequence dependent setup times (this number looks a little bit high to me).
A generalization of setup times that are sequence dependent is to consider them history dependent, i.e. dependent on a number of preceding jobs. See for instance http://onlinelibrary.wiley.com/doi/10.1002/nav.21472/abstract.

PS. After the comment from Paul Rubin, I updated the big-M calculation. This reduces the size of M from 3819 to 1775. Although this is a better formulation, an example run with Cplex, 1 thread actually shows somewhat poorer performance of the new M. In the picture below we have M1 as the old M=3819 and M2 the results for M=1775.

Sometimes with a MIP we see that a better formulation can have worse performance on certain instances.

PS2. See comment below. Officially we have tardiness = max(0,lateness). I hardly use the term tardiness: “normal people” tend to use the more intuitive earliness and lateness (both nonnegative).

Optimization for Dummies

Gurobi came out with a nice little booklet explaining optimization in basic non-geek terms. It may be useful to show your boss what you want him (or her) to spend some money on. If you want a copy, contact the Gurobi people (www.gurobi.com).

Actually I have purchased a few of these “for dummies” books – there must be a ton of them by now –, and in general they are quite well written.

Wednesday, December 14, 2011

Logs in NLP

In http://www.or-exchange.com/questions/4214/entropy-maximization-log0-ampl-tricks someone seems to claim that:

y=log(x),
x.lo = 0.001

is not good modeling practice. I don’t agree: the better NLP solvers don’t evaluate nonlinear functions outside their bounds. And from my more than 20 year experience in nonlinear modeling: I have used this construct all the time.

A large percentage of NLPs that fail can be fixed by looking into specifying:

better bounds
better starting point
and better scaling

The comment below talks about IPOPT. No, even IPOPT will not evaluate nl funcs outside their bounds (come on, it is an interior point code!!). See http://drops.dagstuhl.de/volltexte/2009/2089/pdf/09061.WaechterAndreas.Paper.2089.pdf . Slight complication is bound_relax_factor, but in practice that is not a problem. Maybe the poster is confusing infeasibility and interior points: a point can be inside the bounds but still be infeasible.

Apart from this remark, indeed IPOPT is an excellent solver. It is a fantastic complement to the well-known active-set solvers like CONOPT, MINOS and SNOPT. If you have large models with many superbasics: these solvers tend to have problems with that, while an interior point algorithm does not really care.

Student posts (2)

One more for my collection (http://yetanothermathprogrammingconsultant.blogspot.com/2010/09/student-mails.html):

Bonjour,

Je viens de voir votre blog sur la programmation linéaire. En fait je débute sur GLPK et je voulais
vous poser la question: comment exprimer une condition dans une contrainte?
J'ai utilisé if mais ça marche pas.

Merci d'avance.
Cordialement,

Actually I understand enough French to understand this, but nevertheless it is somewhat indicative of how little effort is being employed before sending emails around asking for help.

Analytics in Retail

The last few day I was working with some people on some reasonably complicated optimization problems in the retail industry. One of the great resources on using analytics in that environment is:

http://www.amazon.com/New-Science-Retailing-Transforming-Performance/dp/1422110575/

On my way back from Chicago there was bad weather and planes breaking down, so I had quite some time to read this. Highly recommended to get a feeling about some issues in this industry that can be tackled by some math.

During the work we found again that it really helps to try to develop some models very early in the project. Even with invented data, it allows you to really focus on the problem. As a result you will find the weak spots in your knowledge about the problem and will generate often very pointed questions. The exercise of inventing data is also a very good tool to get a better understanding.

Thursday, December 1, 2011

Modeling is more difficult than many people think…

From a Bachelors thesis:

In this simple transportation model I see at least two problems.

The capacity constraint should probably look like:

forall (o in DCs)
ctCapacity:
sum(p in Products, d in Stores)
Trans[p][o][d] <= Capacity[o];

The care needed to write correct models is often underestimated.

Matrix multiplication

Theory: http://www.scottaaronson.com/blog/?p=839. This really seems to excite the theorists.
In practice: http://yetanothermathprogrammingconsultant.blogspot.com/2010/06/sparse-matmul-revisited.html