Monday, March 29, 2021

Mathematical Notation

I often tell beginners in optimization modeling that before starting to code, they should get a piece of paper and write down the mathematical model. There are several reasons for this:

  • A mathematical model is a high-abstraction-level, compact expression of the problem. It should fit on one piece of paper for all but the most complex models (even then I would say the core of the model should fit on one piece of paper). If you are unable to write this down, you are not ready yet, and you are also unlikely to succeed in coding things up.
  • Mentally, it is easier to throw away some scribbled equations and start afresh than once you have invested time and effort into code. Code is not cast in stone, but sometimes it almost is. Psychologically, throwing away code is not always easy. (I believe it was Donald Knuth who said: a good approach is: write a program, throw it away and start over).  
  • When I am working on a model, I can reproduce the mathematical model from memory at any time on a piece of paper, on a blackboard, or on a PowerPoint sheet. I often find the physical process of writing down the model equations helps. This has something to do with different parts of the brain being involved in writing vs typing in and looking at some code on a screen.
  • I try to stick to the usual conventions. Like using \(x_i\) and never something like \(j_x\). It is amazing how much these conventions are hardwired in my brain and how much more difficult things become when we stray from these habits. In principle, a name is just a name, but clearly, that is not the case here. Mathematics loses its power as a communication tool when we don't use these conventions.
  • Of course, not everyone is wired the same way or has the same skills and experience. It is to be expected that different people work best in different ways. 
But sometimes I see math, which is really more confusing than it helps me. Like [1]:

# Mathematical representation Y[X,Y] - X[X,Y] <= 0 X[X,Y] - M*Y[X,Y] <= 0

This throws me off completely. What does this even mean? 

(Using X and Y as variable and as index can not be right here. Using an integer variable as index is used in Constraint Programming -- the element constraint but that is not the purpose here. Here it almost reads like a recursion).


References


Sunday, March 28, 2021

Parallel machine scheduling I, two formulations

Scheduling problems with multiple machines are not that easy for MIP solvers. However, modern solvers are capable to solve reasonably sized problems to optimality quite efficiently. Here are some notes.

Data and solution


It is always helpful to start with an actual data set. So here we go:



----     29 PARAMETER jobdata  

         proctime     release     duedate     weights

job1            4          61          70           1
job2            9          59          78           1
job3            7                      67           1
job4            5          21          35           1
job5            5           6          17           1
job6            4                      19           1
job7            5          57          68           1
job8            9          41          59           1
job9            3                      12           1
job10           7          67          82           1
job11          10                      16           1
job12           7           9          25           1
job13          10          40          57           1
job14           9          58          78           1
job15           4                      26           1
job16           8                      13           1
job17           4                      63           1
job18           5                      66           1
job19           8                      45           1
job20           6          25          42           1
job21           5                      32           1
job22           5                      32           1
job23           4           5          21           1
job24           4                      94           1
job25           7          22          44           1
job26           9          60          81           1
job27           4                      37           1
job28           8                      21           1
job29           9          61          78           1
job30           5           4          16           1
job31           3                      28           1
job32           7                      10           1
job33           4          24          34           1
job34           9          39          55           1
job35           5                      23           1
job36           5                      25           1
job37           7          24          40           1
job38           8                      38           1
job39           8                      39           1
job40           6                      97           1
job41           6                     100           1
job42           3          33          43           1
job43           5          28          43           1
job44           3          64          80           1
job45           5          31          46           1
job46           4                      93           1
job47           8           2          20           1
job48           7          59          76           1
job49           9                      15           1
job50           5          51          62           1


----     33 SET precedence  precedence constraints: i must execute before j

             job4        job8        job9       job10       job11       job13       job16       job20       job33

job1          YES
job4                      YES
job6                                                          YES
job7                                              YES
job8                                  YES                     YES         YES
job11                                                                                 YES
job15                                                                                 YES
job18                                                                                             YES
job28                                                                                                         YES
job30                                                                                                         YES

    +       job36       job37       job44

job33         YES         YES
job40                                 YES


Notes:
  • Some jobs have a nonzero release date: they cannot start before a certain time. 
  • Some jobs also have some precedence relationships: they cannot start until some other job finishes. For instance: job4 needs to wait until job1 is completed. 
  • The weights on the jobs can be used used to make some jobs more important: they may be used in some objectives.
  • We just indicate time here by integer values. In practice, time may be more like a calendar, with days off (weekends, observed holidays). In that case, you have to decide whether to use dates directly as the time index or map them first to integers. In general, I would prefer to use the dates directly, but in your case, that may be different. 

Assume we have four identical machines at our disposal. Then a solution that minimizes the amount jobs are violating their due date (tardiness) can look like:


Tuesday, March 23, 2021

SOS1 sets: when to use them

Short answer: almost never.


Beginners in optimization are always very excited when they read about SOS sets in the documentation. This looks like a very efficient tool to use for constructs like pick (at most) 1 out of \(n\) (SOS1) or for interpolation (SOS2). However, experienced modelers use SOS sets very sparingly. Why? Well, solvers like binary variables much better than SOS sets. Binary variables yield great bounding and cuts (there has been enormous progress in this area), while SOS constructs are really just about branching. 


Illustration


Consider the assignment problem: 


Assignment problem A
\[\begin{align}\max& \sum_{i,j} \color{darkblue}c_{i,j} \cdot \color{darkred}x_{i,j} \\ & \sum_j \color{darkred}x_{i,j} \le 1 && \forall i \\ & \sum_i \color{darkred}x_{i,j} \le 1 && \forall j \\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align}\]


This MIP model can actually be solved as an LP (the variables are integer-valued automatically). A SOS1 version of this model can look like:

Assignment problem B, SOS1 version
\[\begin{align}\max& \sum_{i,j} \color{darkblue}c_{i,j} \cdot \color{darkred}x_{i,j} \\ & \color{darkred}x_{i,1},\color{darkred}x_{i,2},\dots,\color{darkred}x_{i,n}\in \mathbf{SOS1} && \forall i \\ & \color{darkred}x_{1,j},\color{darkred}x_{2,j},\dots,\color{darkred}x_{n,j}\in \mathbf{SOS1} && \forall j \\ & \color{darkred}x_{i,j} \in [0,1] \end{align}\]


Wednesday, March 17, 2021

MST + few Steiner points: convex quadratic model

In [1], I discussed a model for the Euclidean Streiner Tree problem. A non-convex integer programming model from the literature was reformulated into a MISOCP (not a version of a Japanese dish, but a Mixed-Integer Second-Order Cone Program). Together with a symmetry-breaking constraint and some high-performance solvers, this can lead to being able to solve larger models. 

One of the disadvantages of the model in [1] is that it only works with the full number of Steiner points \(n-2\) where \(n\) is the number of given (original) points. Here I want to discuss how we can use Minimum Spanning Tree (MST) models [2], as a building block for a model where we can add just a few Steiner points. One reason this is interesting: the first few Steiner points have the most impact on the objective.

We can just pick any model from [2]. Let's use the single-commodity flow model:


MST: Single-commodity flow formulation
\[\begin{align}\min\>&\color{darkred}z = \sum_{(i,j)\in \color{darkblue}A} \color{darkblue}c_{i,j}\color{darkred}x_{i,j}\\& \sum_{j|(i,j)\in \color{darkblue}A}\color{darkred}f_{i,j}=\sum_{j|(j,i)\in\color{darkblue}A}\color{darkred}f_{j,i} + \color{darkblue}b_{i} && \forall i \\ & \color{darkblue}M\cdot \color{darkred}x_{i,j} \ge \color{darkred}f_{i,j} && \forall (i,j) \in \color{darkblue}A \\ & \color{darkred}x_{i,j}\in \{0,1\}\\ & \color{darkred}f_{i,j} \in \{0,1,2,\dots,\color{darkblue}n-1\}\end{align}\]


Tuesday, March 16, 2021

Excel Never Dies



In a post by (famous economist) Paul Krugman:

I’m sometimes embarrassed at how much I use Excel. But it turns out to be a work of genius.

The online article he refers to is:

Packy McCormick, Ben Rollert, Excel Never Dieshttps://www.notboring.co/p/excel-never-dies

It is a nice read.

In my work, I see Excel intensively used by (almost) all clients I work with or talk to. One sometimes wonders: what would happen if Excel would just suddenly stop working everywhere? 

Monday, March 15, 2021

Euclidean Steiner tree problems as MISOCP

In [1] I looked at MIP models for the Minimum Spanning Tree problem. A related, but much more difficult problem is the Euclidean Steiner Tree Problem. Here we are allowed to add points to make the tree shorter. Here are two obvious examples [2]:




Friday, March 12, 2021

Monday, March 8, 2021

Minimum Spanning Trees in Math Programming Models

Algorithms for the Minimum Spanning Tree (MST) problem are readily available. However, sometimes we want to solve this problem inside a Mathematical Programming model. Usually, this is for two reasons:

  • We have some side constraints
  • Or as part of a larger model
These reasons are essentially the same (a matter of gradation). Embedding an MST inside a model is not totally trivial.


Data


I used in our models the data from [1]. This data set has distances for 42 US cities.


----    298 SET cities  

Manchester, N.H.    ,    Montpelier, Vt.     ,    Detroit, Mich.      ,    Cleveland, Ohio     ,    Charleston, W.Va.   
Louisville, Ky.     ,    Indianapolis, Ind.  ,    Chicago, Ill.       ,    Milwaukee, Wis.     ,    Minneapolis, Minn.  
Pierre, S.D.        ,    Bismarck, N.D.      ,    Helena, Mont.       ,    Seattle, Wash.      ,    Portland, Ore.      
Boise, Idaho        ,    Salt Lake City, Utah,    Carson City, Nevada ,    Los Angeles, Calif. ,    Phoenix, Ariz.      
Santa Fe, N.M.      ,    Denver, Colo.       ,    Cheyenne, Wyo.      ,    Omaha, Neb.         ,    Des Moines, Iowa    
Kansas City, Mo.    ,    Topeka, Kans.       ,    Oklahoma City, Okla.,    Dallas, Tex.        ,    Little Rock, Ark.   
Memphis, Tenn.      ,    Jackson, Miss.      ,    New Orleans, La.    ,    Birmingham, Ala.    ,    Atlanta, Ga.        
Jacksonville, Fla.  ,    Columbia, S.C.      ,    Raleigh, N.C.       ,    Richmond, Va.       ,    Washington, D.C.    
Boston, Mass.       ,    Portland, Me.       


The original data set is called dantzig42 from TSPLIB [8].

An optimal spanning tree can look like:


Monday, March 1, 2021

A difficult multi-line regression data set

 In [1], I discussed some models for a small 2 line regression problem. Here I want to dive into a more challenging problem.


Data generation process


I started with the following five lines: \[\begin{align}&f_1(x)=200-3x\\&f_2(x)=150-x\\&f_3(x)=10\\&f_4(x)=-50+2x\\&f_5(x)=-120\end{align}\] To generate points, I used \[\begin{align}&x_i \sim U(0,100)\\ & k_i \sim U\{1,5\} \\ & \epsilon_i \sim N(0,15)\\ & y_i = a_{k_i} + b_{k_i} \cdot x_i + \epsilon_i \end{align}\]

This corresponds to the following picture: