Tuesday, October 9, 2018

A pure fixed charge transportation problem

The standard transportation problem is an example of an easy linear programming problem:

Transportation Problem
\[\begin{align}\min & \sum_{i,j} \color{DarkBlue} c_{i,j} \color{DarkRed} x_{i,j}\\ & \sum_j \color{DarkRed} x_{i,j} = \color{DarkBlue} {\mathit{supply}}_i\\ & \sum_i \color{DarkRed} x_{i,j} = \color{DarkBlue} {\mathit{demand}}_j \\ & \color{DarkRed} x_{i,j} \ge 0\end{align}\]

We can solve very large instances very quickly. The fixed charge transportation problem is much more difficult:

Fixed Charge Transportation Problem
\[\begin{align}\min & \sum_{i,j} \color{DarkBlue} c_{i,j} \color{DarkRed} x_{i,j} + \color{DarkBlue} d_{i,j} \color{DarkRed} y_{i,j}\\ & \sum_j \color{DarkRed} x_{i,j} = \color{DarkBlue} {\mathit{supply}}_i\\ & \sum_i \color{DarkRed} x_{i,j} = \color{DarkBlue}{\mathit{demand}}_j \\ & 0 \le \color{DarkRed} x_{i,j} \le \color{DarkBlue} M \color{DarkRed} y_{i,j}\\ & \color{DarkRed} y_{i,j} \in \{0,1\}\end{align}\]

Here we have an additional cost \(d_{i,j}\) if link \(x_{i,j}\) is used (i.e. if \(x_{i,j}\gt 0\)).  The binary variable \(y_{i,j}\) indicates if link \(i \rightarrow j\) is used. I.e. we want: \[y_{i,j} = 0 \Rightarrow x_{i,j}=0\] The constraint \(x_{i,j} \le M y_{i,j}\) is a big-M formulation for this. Notice that we do not need to worry about \(x_{i,j}=0 \Rightarrow y_{i,j}=0\): this will be taken care of automatically (because of the objective function). In practice we can do \[\begin{align} & x_{i,j} \le M_{i,j} y_{i,j}\\ & M_{i,j} = \min \{ \mathit{supply}_i,  \mathit{demand}_j \}\end{align}\] This gives some reasonable values for \(M_{i,j}\). A problem with only fixed charge costs (i.e. \(c_{i,j}=0\)) is sometimes called a pure fixed charge transportation problem. If we just want to minimize the number of links that are being used we can set \(c_{i,j}=0\) and \(d_{i,j}=1\). These models can be incredibly difficult to solve to proven optimality.

Here we try to solve a very small problem with 10 source nodes and 10 destination nodes. I used random values for the supply and demand:


----     20 PARAMETER supply  

src1  0.041,    src2  0.203,    src3  0.132,    src4  0.072,    src5  0.070,    src6  0.054,    src7  0.084
src8  0.206,    src9  0.016,    src10 0.120


----     20 PARAMETER demand  

dest1  0.178,    dest2  0.103,    dest3  0.177,    dest4  0.136,    dest5  0.023,    dest6  0.114,    dest7  0.028
dest8  0.045,    dest9  0.119,    dest10 0.078


The objective is \(\min \sum_{i,j} y_{i,j}\), i.e. we minimize the number of open links. This means we have just 100 binary variables. After setting a time limit of 9999 seconds I see:

. . .
 5938307 5589691       15.7946    12       18.0000       15.0950 1.71e+08   16.14%
 5946634 5596825       15.1040    25       18.0000       15.0950 1.72e+08   16.14%
 5959435 5611506       15.1737    19       18.0000       15.0950 1.72e+08   16.14%
Elapsed time = 9977.59 sec. (3226694.74 ticks, tree = 38155.78 MB, solutions = 3)
Nodefile size = 36108.12 MB (23960.92 MB after compression)
 5965049 5616397       16.0000    19       18.0000       15.0950 1.72e+08   16.14%

Cover cuts applied:  264
Flow cuts applied:  5
Mixed integer rounding cuts applied:  331
Zero-half cuts applied:  4
Multi commodity flow cuts applied:  4

Root node processing (before b&c):
  Real time             =    0.88 sec. (41.53 ticks)
Parallel b&c, 8 threads:
  Real time             = 10021.94 sec. (3248441.83 ticks)
  Sync time (average)   = 5484.02 sec.
  Wait time (average)   =    0.11 sec.
                          ------------
Total (root+branch&cut) = 10022.81 sec. (3248483.36 ticks)
MIP status(107): time limit exceeded
Cplex Time: 10023.03sec (det. 3248483.36 ticks)

Ouch.. If you would have asked before, I certainly would have predicted somewhat better performance.

MIP bounds don't change anymore


The best objective value of 18 was found after 25 seconds.  We might as well have stopped at that point...

Most likely 18 is the optimal objective, but we have not proven that. The best solution looks like:

Best found solution

Saturday, October 6, 2018

Just your average Excel formula


Source:
Ira Iosebashvili, Wall Street Journal, The First Rule of Microsoft Excel -- Don't Tell Anyone You're Good at it, Oct. 5, 2018
https://www.wsj.com/articles/the-first-rule-of-microsoft-exceldont-tell-anyone-youre-good-at-it-1538754380

Wednesday, October 3, 2018

Scheduling and three sided football

Design a tournament schedule


In [1] a question is posed how to design a schedule for a sequence games:

  • There are 9 teams,
  • we have 4 rounds of 3 matches,
  • each match consists of 3 teams,
  • a team plays with another team exactly once in the same match.
This problem is related to the social golfer problem [2,3]. Not many games have three teams but there is something called three sided football [4].


Three sided football field [4]

Integer Quadratically Constrained Model


To find a feasible schedule we use the following sets: \[\begin{align} & t \in \{A,\dots,I\} && \text{9 teams} \\ & r \in \{\mathit{round}_1,\dots,\mathit{round}_4\} && \text{4 rounds}\\ & m \in \{\mathit{match}_1,\dots,\mathit{match}_3\} && \text{3 matches per round} \end{align}\] We introduce binary variables \[x_{r,m,t} = \begin{cases} 1 & \text{if team $t$ plays in match $m$ in round $r$} \\ 0 & \text{otherwise}\end{cases}\]
A high-level model can look like:

MIQCP feasibility model
\[\sum_m x_{r,m,t} = 1 \>\>\forall r,t\]Each team plays exactly once in each round
\[\sum_t x_{r,m,t} = 3 \>\>\forall r,m\]Each match consists of three teams
\[\sum_{r,m} x_{r,m,t} x_{r,m,t'} = 1 \>\>\forall t\lt t'\]Teams \(t\) and \(t'\) play exactly once in the same match. This is a non-convex, quadratic constraint. 

There is no objective: we are only looking for a feasible solution. The last equation is the most complicated one. We have \(x_{r,m,t} x_{r,m,t'}=1\) if teams \((t,t')\)  are in the same match \((r,m)\). Because of symmetry in the product \(x_{r,m,t} x_{r,m,t'}\) , we only need to check teams \(t\lt t'\). This saves us a few quadratic constraints.

This model can be solved with a constraint solver or a global (non-convex) MIQCP (Mixed Integer Quadratically Constrained Programming) solver. Of course, with a MIQCP solver we need to add a dummy objective. Global solvers, like Baron, Couenne or Antigone, can solve this easily. The results can look like:

Solution
We have marked the cases where \(x_{r,m,t}=1\). We can check that:

  • Every row has three entries (a match has three teams)
  • Every column has four entries: one in each round (a team plays once in each round)
  • Two teams play in the same match exactly once (e.g. A and B meet in round 1 only)

It is noted that there are many feasible solutions.

Making the problem convex, linear


If we want to solve with a convex MIQCP solver (like Cplex or Gurobi) we can replace the quadratic constraint by: \[\sum_{r,m} x_{r,m,t} x_{r,m,t'} \le 1 \>\>\forall t\lt t'\]

Finally, if we want, we can linearize the quadratic constraint. That would allow us to solve this problem with a straight linear MIP solver. We can write: \[\begin{align} & y_{r,m,t,t'} \ge x_{r,m,t} + x_{r,m,t'} - 1 && \forall r,m,t\lt t'\\ & \sum_{r,m} y_{r,m,t,t'} \le 1 && \forall t\lt t'\\ &   y_{r,m,t,t'} \in [0, 1]\end{align}\] Again, we need to do this only for \(t\lt t'\). The MIP model performs better than the quadratic versions if the problems become a bit larger. It is not unusual that linarized models, solved as a MIP, perform better than using a quadratic solver against the original model, even if we have to add a significant number of variables and equations in the process.

There is lots of symmetry in this problem and there are quite a few papers around, mainly from a constraint programming view, that try to deal with this.

Fixing


For larger problems there are a few tricks we can apply. For example, we can fix the first two rounds, and let the model only worry about the subsequent rounds. In addition we can fix the first \(m\) teams in all rounds. The idea is to make the remaining problem smaller so that the solver has an easier job. Fixing also reduces symmetry. With this I could solve very quickly the following instances [5] (I used the MIP formulation):

5 rounds, 4 matches, 16 teams (first 2 rounds, first 4 teams fixed)

6 rounds, 5 matches, 25 teams (first 2 rounds, first 5 teams fixed)

The dark areas were fixed and the light area was solved by the MIP solver.

Conclusion


With two simple tricks:

  • linearization of the quadratic constraint 
  • fixing a significant part of the problem
we can solve reasonable sized schedule design problems with a good MIP solver quickly. 


References


  1. PHP - Evenly Distribute Teams Into Arrays So That No Combination Repeats, https://stackoverflow.com/questions/52600546/php-evenly-distribute-teams-into-arrays-so-that-no-combination-repeats
  2. Markus Triska and Nysret Musliu, An Improved SAT Formulation for the Social Golfer Problem, Annals of Operations Research Vol. 194(1) (2012), pp. 427-438
  3. Nicolas Barnier, Pascal Brisset. Solving the Kirkman’s schoolgirl problem in a few seconds. CP 2002, 8th International Conference on Principles and Practice of Constraint Programming, Sep 2002, Ithaca, United States. Springer, 2470, pp 33-41, 2002.
  4. Three Sided Football, https://en.wikipedia.org/wiki/Three_sided_football
  5. All Match Combinations With Either 4 or 5 Teams Per Match When There are 16 or 25 Teams, https://math.stackexchange.com/questions/2939746/all-match-combinations-with-either-4-or-5-teams-per-match-when-there-are-16-or-2