Sunday, June 30, 2019

Maximum Dispersion

Problem statement

Given \(n\) points with their distances  \(d_{i,j}\), select \(k\) points such that the sum of the distances between the selected points is maximized [1].

A simple MIQP (Mixed-Integer Quadratic Programming) model is:

Non-convex MIQP Model
\[\begin{align} \max & \sum_{i \lt j} \color{darkblue} d_{i,j} \color{darkred}x_{i} \color{darkred}x_{j} \\ & \sum_i \color{darkred} x_{i} = \color{darkblue} k \\ & \color{darkred}x_{i} \in \{0,1\} \end{align}\]

Notice we only need to look at distances \(d_{i,j}\) with \(i\lt j\) as we can assume symmetry. If not, just make \(d_{i,j}\) symmetric by: \[d_{i,j} = \frac{d_{i,j} + d_{j,i}}{2}\]
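A minimal Python sketch of this symmetrization step (the small matrix is made up):

```python
# Symmetrize a (possibly asymmetric) distance matrix by averaging
# d[i][j] and d[j][i], as in the formula above. Plain-Python sketch.

def symmetrize(d):
    n = len(d)
    return [[(d[i][j] + d[j][i]) / 2 for j in range(n)] for i in range(n)]

d = [[0, 1, 4],
     [3, 0, 2],
     [2, 6, 0]]
s = symmetrize(d)
print(s)  # s[0][1] == s[1][0] == 2.0
```

The symmetrized matrix gives the same objective value as the original, since \(x_i x_j = x_j x_i\).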

As we shall see, this is not such an easy problem to solve to optimality.

This problem is called the \(p\)-dispersion-sum problem.

Example data

I generated random \(n=50\) \(2d\) points:

----     24 PARAMETER coord  random coordinates

              x           y

i1        1.717       8.433
i2        5.504       3.011
i3        2.922       2.241
i4        3.498       8.563
i5        0.671       5.002
i6        9.981       5.787
i7        9.911       7.623
i8        1.307       6.397
i9        1.595       2.501
i10       6.689       4.354
i11       3.597       3.514
i12       1.315       1.501
i13       5.891       8.309
i14       2.308       6.657
i15       7.759       3.037
i16       1.105       5.024
i17       1.602       8.725
i18       2.651       2.858
i19       5.940       7.227
i20       6.282       4.638
i21       4.133       1.177
i22       3.142       0.466
i23       3.386       1.821
i24       6.457       5.607
i25       7.700       2.978
i26       6.611       7.558
i27       6.274       2.839
i28       0.864       1.025
i29       6.413       5.453
i30       0.315       7.924
i31       0.728       1.757
i32       5.256       7.502
i33       1.781       0.341
i34       5.851       6.212
i35       3.894       3.587
i36       2.430       2.464
i37       1.305       9.334
i38       3.799       7.834
i39       3.000       1.255
i40       7.489       0.692
i41       2.020       0.051
i42       2.696       4.999
i43       1.513       1.742
i44       3.306       3.169
i45       3.221       9.640
i46       9.936       3.699
i47       3.729       7.720
i48       3.967       9.131
i49       1.196       7.355
i50       0.554       5.763

Let's try to find \(k=10\) points that are most dispersed. The distance matrix is formed by calculating Euclidean distances.
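For illustration, a small sketch computing these Euclidean distances (using only the first three points from the table, so this is not the full instance):

```python
# Build the Euclidean distance matrix d[i][j] from 2d coordinates.
from math import hypot

coord = [(1.717, 8.433), (5.504, 3.011), (2.922, 2.241)]  # i1, i2, i3

def dist_matrix(pts):
    return [[hypot(a[0] - b[0], a[1] - b[1]) for b in pts] for a in pts]

d = dist_matrix(coord)
# d is symmetric with a zero diagonal; only the i<j part is needed
print(round(d[0][1], 3))
```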

10 out of 50 most dispersed points

Solve non-convex problem

The above MIQP problem is non-convex. We can solve it in different ways:

  • Throw this into a global solver such as Baron, Couenne or Antigone
  • Use a solver like Cplex (option qtolin) or Gurobi (option preQlinearize) that can linearize the problem automatically for us. See further down how we can do this ourselves.
  • Instruct Cplex to use a QP formulation (option qtolin=0) and tell it to use the global QP solver (option optimalitytarget=3)

Convexification of the quadratic model

There is a trick to make this problem convex. For maximization, we need the \(Q\) matrix to be negative semidefinite. In our case the \(Q\) matrix is our distance matrix. The following algorithm will make the problem convex:

  1. Calculate the largest eigenvalue \(\lambda_{max}\) of the distance matrix \(0.5 D\). We multiply by 0.5 because we only use half of the matrix in the objective to prevent double counting.
  2. If \(\lambda_{max} \le 0\): we are done (the matrix is already negative semidefinite)
  3. Form the objective: \[\max \sum_{i \lt j}  d_{i,j} x_{i} x_{j} - \lambda_{max} \sum_i x_i^2 + \lambda_{max} \sum_i x_i\] 
This essentially imposes a possibly large diagonal perturbation of the \(Q\) matrix. We compensate with a linear term, using the fact that \(x_i = x_i^2\) when \(x_i \in \{0,1\}\).
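A quick numerical check of this trick on a made-up 3-point distance matrix. Since \(0.5D\) is nonnegative and symmetric, plain power iteration is enough to recover its largest eigenvalue:

```python
# Verify that the perturbed objective equals the original one on all
# binary vectors (the D matrix here is made up for illustration).
from itertools import product

D = [[0.0, 2.0, 3.0],
     [2.0, 0.0, 4.0],
     [3.0, 4.0, 0.0]]
n = len(D)
Q = [[0.5 * D[i][j] for j in range(n)] for i in range(n)]

def power_iteration(A, iters=500):
    v = [1.0] * len(A)
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(len(A))) for i in range(len(A))]
        lam = max(abs(x) for x in w)
        v = [x / lam for x in w]
    return lam

lam = power_iteration(Q)

def f_orig(x):    # sum_{i<j} d_ij x_i x_j
    return sum(D[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n))

def f_convex(x):  # perturbed objective with the compensating linear term
    quad = sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return quad - lam * sum(xi * xi for xi in x) + lam * sum(x)

# the two objectives coincide on binary vectors, since x_i^2 = x_i there
for x in product([0, 1], repeat=n):
    assert abs(f_orig(x) - f_convex(x)) < 1e-6
print("objectives agree on all binary x; lambda_max =", round(lam, 3))
```

Because the compensation cancels exactly on binary vectors, the transformed model has the same optimal solutions; only the continuous relaxation changes.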


The quadratic model is easily linearized. A well-known trick to handle \(y_{i,j} = x_i x_j\) for binary variables is to write: \[\begin{align} & y_{i,j} \le x_i \\ & y_{i,j} \le x_j \\ & y_{i,j} \ge x_i +x_j-1\\& y_{i,j}, x_i \in \{0,1\}\end{align}\] A complete model can look like:

Linearized Model 1
\[\begin{align} \max & \sum_{i \lt j} \color{darkblue} d_{i,j} \color{darkred}y_{i,j} \\ & \sum_i \color{darkred} x_{i} = \color{darkblue} k \\ & \color{darkred}y_{i,j} \le \color{darkred} x_i && \forall i\lt j \\ & \color{darkred}y_{i,j} \le \color{darkred} x_j && \forall i\lt j \\ & \color{darkred}x_{i} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in [0,1] \end{align}\]
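A quick brute-force check confirms that, for binary \(x\), the two upper bounds alone already cap \(y\) at the product \(x_i x_j\):

```python
# Check the standard linearization of y = x_i * x_j. When maximizing
# d*y subject to y <= x_i and y <= x_j, the optimal y equals the
# product; the lower bound y >= x_i + x_j - 1 is only needed when
# something pushes y down.
from itertools import product

for xi, xj in product([0, 1], repeat=2):
    y_max = min(xi, xj)          # largest y allowed by y<=x_i, y<=x_j
    assert y_max == xi * xj      # equals the product we want
    assert y_max >= xi + xj - 1  # the dropped inequality holds anyway
print("linearization verified for all binary (x_i, x_j)")
```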

There are two things that may need some attention:

  • The inequality \(y_{i,j}\ge x_i +x_j -1\) was dropped. The objective will take care of this.
  • The variables \(y_{i,j}\) were relaxed to be continuous between 0 and 1. The \(y\) variables will be automatically integer (well, where it matters). Often models with fewer integer variables are easier to solve. Modern solvers, however, may reintroduce binary variables for this particular model. In Cplex you can see a message like the one shown below.

Tried aggregator 1 time.
MIP Presolve eliminated 1 rows and 1 columns.
Reduced MIP has 2451 rows, 1275 columns, and 4950 nonzeros.
Reduced MIP has 50 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.06 sec. (2.79 ticks)
Found incumbent of value 0.000000 after 0.06 sec. (4.54 ticks)
Probing time = 0.02 sec. (0.51 ticks)
Tried aggregator 1 time.
Reduced MIP has 2451 rows, 1275 columns, and 4950 nonzeros.
Reduced MIP has 1275 binaries, 0 generals, 0 SOSs, and 0 indicators.

A tighter linearization

In [2] a tighter formulation is proposed:

Linearized Model 2
\[\begin{align} \max & \sum_{i \lt j} \color{darkblue} d_{i,j} \color{darkred}y_{i,j} \\ & \sum_i \color{darkred} x_{i} = \color{darkblue} k \\ & \color{darkred}y_{i,j} \le \color{darkred} x_i && \forall i\lt j \\ & \color{darkred}y_{i,j} \le \color{darkred} x_j && \forall i\lt j \\ & \color{darkred}y_{i,j} \ge \color{darkred} x_i +\color{darkred} x_j -1 && \forall i\lt j \\ &  \sum_{i\lt j} \color{darkred}y_{i,j} + \sum_{i\gt j} \color{darkred}y_{j,i} = (\color{darkblue} k-1) \color{darkred}x_j && \forall j \\ & \color{darkred}x_{i} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in \{0,1\} \end{align}\]

Essentially, we multiplied the constraint \(\sum_i x_{i} = k\) by \(x_j\), and added these as constraints. The derivation is as follows: \[\begin{align} & \left( \sum_i x_i\right) x_j = k x_j\\ \Rightarrow & \sum_{i\lt j} x_i x_j +  x_j^2 + \sum_{i\gt j} x_i x_j =  k x_j\\ \Rightarrow & \sum_{i\lt j} y_{i,j} + \sum_{i\gt j} y_{j,i} =  (k-1) x_j \end{align}\] We used here that \(x_j^2 = x_j\).
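This identity is easy to verify numerically; the sketch below checks it for every subset of a small made-up instance:

```python
# Check the cut derived above with y_ij = x_i * x_j:
#   sum_{i<j} y_ij + sum_{i>j} y_ji = (k-1) * x_j
from itertools import combinations

n, k = 6, 3
for chosen in combinations(range(n), k):
    x = [1 if i in chosen else 0 for i in range(n)]
    for j in range(n):
        lhs = sum(x[i] * x[j] for i in range(n) if i < j) + \
              sum(x[j] * x[i] for i in range(n) if i > j)
        assert lhs == (k - 1) * x[j]
print("cut holds for every subset of size", k)
```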

My intuition is as follows. If \(x_j=1\) then exactly \(k-1\) other \(x_i\)'s should be 1. That means \(k-1\) \(y_{i,j}\)'s should be 1. As we only use the upper-triangular part, the picture becomes:

Picture of y layout
Given that \(x_j=1\), we need to have \(k-1\) ones in the orange region.


We can aggregate the previous cut, which leads to: \[\sum_{i\lt j} y_{i,j} = \frac{k(k-1)}{2}\] The advantage of this version is that we only need one extra constraint [4]:

Linearized Model 3
\[\begin{align} \max & \sum_{i \lt j} \color{darkblue} d_{i,j} \color{darkred}y_{i,j} \\ & \sum_i \color{darkred} x_{i} = \color{darkblue} k \\ & \color{darkred}y_{i,j} \le \color{darkred} x_i && \forall i\lt j \\ & \color{darkred}y_{i,j} \le \color{darkred} x_j && \forall i\lt j \\ & \color{darkred}y_{i,j} \ge \color{darkred} x_i +\color{darkred} x_j -1 && \forall i\lt j \\ &  \sum_{i\lt j} \color{darkred}y_{i,j} = \frac{ \color{darkblue}k (\color{darkblue} k-1)}{2} \\ & \color{darkred}x_{i} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in \{0,1\} \end{align}\]

The intuition is easy: if we have \(k\) \(x_j\)'s equal to one, we need to have \(k(k-1)/2\) \(y_{i,j}\)'s equal to one.
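A small brute-force check of this count (sizes are made up):

```python
# Any 0/1 vector with k ones has exactly k*(k-1)/2 products
# x_i * x_j equal to one (counting only i < j).
from itertools import combinations

n, k = 7, 4
for chosen in combinations(range(n), k):
    x = [1 if i in chosen else 0 for i in range(n)]
    ones = sum(x[i] * x[j] for i in range(n) for j in range(i + 1, n))
    assert ones == k * (k - 1) // 2
print("aggregated cut holds:", k * (k - 1) // 2, "y's equal one")
```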

Numerical results

I tried these formulations using Cplex with a time limit of 1800 seconds (half an hour).

Model             Seconds   Objective   Gap       Notes
Non-convex MIQP      1800    332.1393   87.64%    Solved as quadratic model; options qtolin=0, optimalitytarget=3
Non-convex MIQP      1800    332.9129   57.65%    Automatically converted to linear model; option qtolin=1
Convex MIQP          1800    332.9129   92.22%    option qtolin=0
MIP 1                1800    332.9129   57.65%    Defaults
MIP 2                   8    332.9129   optimal   Defaults (roughly same performance with or without \(y_{i,j}\ge x_i+x_j-1\))
MIP 3                 400    332.9129   optimal   Defaults; excludes \(y_{i,j}\ge x_i+x_j-1\)
MIP 3                  11    332.9129   optimal   Defaults; includes \(y_{i,j}\ge x_i+x_j-1\)

Just one observation, but I think these results are representative for other (small) instances.

We dropped the constraint \(y_{i,j}\ge x_i+x_j-1\) from MIP model 1: the objective will push \(y\) upwards on its own. For models MIP 2 and 3 it is wise to reintroduce them.

The gaps are terrible for all models without extra cuts. However, some of these methods find the best solution very quickly; they are just not able to prove optimality in a reasonable time. Here is an example:

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

      0     0     1235.6817    50                   1235.6817        7         
*     0+    0                          332.9129     1235.6817           271.17%
Found incumbent of value 332.912894 after 0.02 sec. (4.67 ticks)
      0     2     1235.6817    50      332.9129     1162.0829        7  249.07%
Elapsed time = 0.03 sec. (11.72 ticks, tree = 0.02 MB, solutions = 1)

Better dispersion

From the picture of the solution (earlier in this post), we see that this model does not prevent selected points from being close to each other. A better model may be to maximize the minimum distance between selected points. This can be modeled as:

Maximize Minimum Distance
\[\begin{align} \max\> & \color{darkred} {\Delta} \\ & \color{darkred} \Delta \le \color{darkblue} d_{i,j} + \color{darkblue} M (1- \color{darkred}x_{i} \color{darkred}x_{j})  && \forall i\lt j \\ & \sum_i \color{darkred} x_{i} = \color{darkblue} k \\ & \color{darkred}x_{i} \in \{0,1\} \\ \end{align}\]

Here \(M\) is a large enough constant, e.g. \[M = \max_{i\lt j} d_{i,j}\] The model can be easily linearized by formulating the distance constraint as  \[ \Delta \le  d_{i,j} +  M (1- x_{i}) + M(1- x_{j})  \] This model quickly solves our 10 out of 50 problem. It gives the following solution:

Maximization of minimum distance

This model has the disadvantage that it does not care about points being close to each other, as long as they are farther apart than the minimum distance. For this example, visual inspection suggests the model does a good job.
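To illustrate the maximin objective, here is a brute-force sketch on a tiny made-up instance (plain enumeration, not the MIP model itself):

```python
# Pick k of n points maximizing the minimum pairwise distance.
from itertools import combinations
from math import hypot

pts = [(0, 0), (1, 0), (0, 1), (4, 4), (4, 0), (0, 4)]
k = 3

def mindist(sub):
    # smallest pairwise distance within a subset of points
    return min(hypot(a[0] - b[0], a[1] - b[1])
               for a, b in combinations(sub, 2))

best = max(combinations(pts, k), key=mindist)
print("best min-distance:", round(mindist(best), 3))
```

Enumeration works only for tiny instances, of course: there are \(\binom{n}{k}\) subsets to check.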


The \(p\)-dispersion-sum problem is very difficult to solve to optimality, even for very small data sets. Extra cuts can help the linearized version of the model enormously. A main drawback is that the optimal solutions do not provide us with well-dispersed points (selected points can still be very close to each other).

The maximin model solves much easier, and it gives better dispersion: selected points are never close to each other.


  1. How to select n objects from a set of N objects, maximizing the sum of pairwise distances between them
  2. Warren P. Adams and Hanif D. Sherali, A Tight Linearization and an Algorithm for Zero-One Quadratic Programming Problems, Management Science, Vol. 32, No. 10 (Oct., 1986), pp. 1274-1290
  3. Michael J. Kuby, Programming Models for Facility Dispersion: The p‐Dispersion and Maxisum Dispersion Problems, Geographical Analysis, vol. 19, pp.315-329, 1987.
  4. Ed Klotz, Performance Tuning for Cplex’s Spatial Branch-and-Bound Solver for Global Nonconvex (Mixed Integer) Quadratic Programs

Friday, June 21, 2019

Special assignment problem

Problem Description

This is adapted from [1].

Assume we have data \(a_i\) and \(b_j\):

----     11 PARAMETER a  

i1  0.172,    i2  0.843,    i3  0.550,    i4  0.301,    i5  0.292,    i6  0.224,    i7  0.350,    i8  0.856,    i9  0.067
i10 0.500

----     11 PARAMETER b  

j1  0.998,    j2  0.621,    j3  0.992,    j4  0.786,    j5  0.218,    j6  0.676,    j7  0.244,    j8  0.325,    j9  0.702
j10 0.492,    j11 0.424,    j12 0.416,    j13 0.218,    j14 0.235,    j15 0.630

Try to find an assignment of each \(a_i\) to a unique \(b_j\) such that the quantities \[ \frac{a_i}{b_j}\] are as close to each other as possible.

This is more or less an assignment problem. The objective of making all ratios \(a_i/b_j\) as equal as possible makes it interesting. I'll discuss some formulations for this problem.

A first model

We first introduce assignment variables: \[x_{i,j} = \begin{cases} 1 & \text{if $a_i$ is assigned to $b_j$} \\ 0 & \text{otherwise}\end{cases}\] The standard assignment constraints apply \[\begin{align}&\sum_i x_{i,j}\le 1&&\forall j \\&\sum_j x_{i,j} = 1&&\forall i  \end{align}\] This is an "unbalanced" assignment: we have more \(j\)'s than \(i\)'s.

To model "all about equal", we introduce a continuous variable \(c\) that represents the common value. I.e. \[\frac{a_i}{b_j} \approx c \>\>\text{for all  $(i,j)$ such that $x_{i,j}=1$}\]

The objective is \[\min \sum_{i,j}  \left[ x_{i,j} \left( \frac{a_i}{b_j} - c\right) \right]^2 \] or  \[\min \sum_{i,j} x_{i,j} \left( \frac{a_i}{b_j} - c\right)^2 \]
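The two objective forms are equivalent for binary \(x_{i,j}\), since \(x_{i,j}^2 = x_{i,j}\); a tiny check:

```python
# For binary x, [x*(a/b - c)]^2 == x*(a/b - c)^2, since x^2 = x.
for x in (0, 1):
    for r in (0.5, 1.7, -0.3):   # r stands for a_i/b_j - c
        assert (x * r) ** 2 == x * r ** 2
print("both objective forms agree for binary x")
```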

A complete model can look like:

\[\begin{align} \min & \sum_{i,j} \color{darkred}x_{i,j} \left( \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkred}c\right)^2 \\ & \sum_j \color{darkred} x_{i,j} = 1 &&\forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 &&\forall j\\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \text{ free} \end{align}\]

When we run this model, using an MINLP solver, we see:

----     33 VARIABLE x.L  assignment

             j1          j3          j4          j5          j7          j8          j9         j10         j11         j12

i1                                                        1.000
i2                    1.000
i3                                1.000
i4                                                                                                        1.000
i5                                                                                                                    1.000
i6                                                                    1.000
i7                                                                                            1.000
i8        1.000
i9                                            1.000
i10                                                                               1.000

----     33 VARIABLE c.L                   =        0.695  common value
            VARIABLE z.L                   =        0.201  objective

The assignment can be depicted as follows.

Solution. Matrix values are a(i)/b(j).
In the above matrix the cell values are \[\text{cell}_{i,j} = \frac{a_i}{b_j}\]

A quadratic model

We can reformulate this MINLP model into a quadratic model by introducing variables  \[y_{i,j} = x_{i,j} c\] This non-linear constraint can be linearized by stating \[\begin{align} & x_{i,j} = 0 \Rightarrow y_{i,j} = 0\\ &x_{i,j} = 1 \Rightarrow y_{i,j} = c\end{align}\] With these \(y\) variables we can form an objective: \[\sum_{i,j}  \left( x_{i,j}  \frac{a_i}{b_j} - y_{i,j} \right)^2  \] Unfortunately, this is a non-convex objective. There are not many non-convex MIQP solvers around (Cplex has one). This model gives the same solution as the MINLP model above, so I am not showing the results here.

Non-convex MIQP Model
\[\begin{align} \min & \sum_{i,j} \left( \color{darkred}x_{i,j} \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkred}y_{i,j}\right)^2 \\ & \sum_j \color{darkred} x_{i,j} = 1 &&\forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 &&\forall j\\ & \color{darkred} x_{i,j} = 0 \Rightarrow \color{darkred}y_{i,j} = 0 &&\forall i,j\\ & \color{darkred} x_{i,j} = 1 \Rightarrow \color{darkred}y_{i,j} = c &&\forall i,j\\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}y_{i,j} \ge 0 \\ & \color{darkred}c \ge 0 \end{align}\]

We assumed here positive fractions \(a_i/b_j \gt 0\). This allows us to set \(c\ge 0\) and \(y_{i,j}\ge 0\).

A linear model

We can approximate the quadratic weighting of the residuals by absolute values: \[\sum_{i,j}  \left| x_{i,j} \left( \frac{a_i}{b_j} - c \right) \right| \] The first thing to do is to linearize the term \(x_{i,j} c\). As in the quadratic model, we introduce variables \[y_{i,j} = x_{i,j} c\] We can form the implications: \[\begin{align} & x_{i,j} = 0 \Rightarrow y_{i,j} = 0\\ &x_{i,j} = 1 \Rightarrow y_{i,j} = c\end{align}\] Most solvers support indicator constraints, which allow us to implement these implications as is. If we have a solver without this, we can formulate this as a bunch of big-\(M\) constraints: \[\begin{align} & - M x_{i,j} \le y_{i,j} \le M x_{i,j} \\ & c-M(1-x_{i,j}) \le y_{i,j} \le c+M(1-x_{i,j})\end{align}\] From the data we can assume \(a_{i}\ge 0\) and \(b_{j}\gt 0\). We can exploit this: \[\begin{align} & 0 \le y_{i,j} \le M_1 x_{i,j} \\ & c-M_2(1-x_{i,j}) \le y_{i,j} \le c+M_2(1-x_{i,j})\end{align}\] We can think a bit about good values for \(M_1\) and \(M_2\). I suggest: \[M_1 = M_2 = \max c = \max_{i,j} \frac{a_i}{b_j}\]
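A quick sanity check of these big-\(M\) bounds (the value \(M=5\) is just an assumed upper bound on \(c\)):

```python
# Verify that the big-M constraints force y = x*c for binary x,
# using M1 = M2 = M as suggested in the text.
M = 5.0  # assumed upper bound on c

def feasible_y(x, c):
    """Interval of y values allowed by the big-M constraints."""
    lo = max(0.0, c - M * (1 - x))
    hi = min(M * x, c + M * (1 - x))
    return lo, hi

for c in (0.0, 1.3, 4.9):
    # x = 1 pins y to c; x = 0 pins y to 0
    assert feasible_y(1, c) == (c, c)
    assert feasible_y(0, c) == (0.0, 0.0)
print("big-M constraints force y = x*c for binary x")
```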

The complete model can look like

MIP Model
\[\begin{align} \min & \sum_{i,j} \color{darkred}r_{i,j} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j\\ & -\color{darkred}r_{i,j} \le \color{darkred} x_{i,j} \frac{\color{darkblue} a_i}{\color{darkblue} b_j} - \color{darkred} y_{i,j} \le \color{darkred}r_{i,j} &&\forall i,j\\ & \color{darkred} y_{i,j} \le \color{darkblue} M_1 \color{darkred} x_{i,j}  &&\forall i,j\\ & \color{darkred} c - \color{darkblue} M_2 (1- \color{darkred} x_{i,j}) \le \color{darkred} y_{i,j} \le \color{darkred} c + \color{darkblue} M_2 (1- \color{darkred} x_{i,j})&&\forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \ge 0 \\ & \color{darkred}y_{i,j} \ge 0\\ & \color{darkred}r_{i,j} \ge 0\\ \end{align}\]

The results look like:

----     72 VARIABLE x.L  assignment

             j1          j3          j4          j5          j7          j8          j9         j10         j11         j12

i1                                                        1.000
i2                    1.000
i3                                1.000
i4                                                                                                        1.000
i5                                                                                                                    1.000
i6                                                                    1.000
i7                                                                                            1.000
i8        1.000
i9                                            1.000
i10                                                                               1.000

----     72 VARIABLE c.L                   =        0.711  common value

----     72 VARIABLE y.L  c*x(i,j)

             j1          j3          j4          j5          j7          j8          j9         j10         j11         j12

i1                                                        0.711
i2                    0.711
i3                                0.711
i4                                                                                                        0.711
i5                                                                                                                    0.711
i6                                                                    0.711
i7                                                                                            0.711
i8        0.711
i9                                            0.711
i10                                                                               0.711

----     72 VARIABLE r.L  abs(residuals)

             j1          j3          j4          j5          j7          j8          j9         j10         j12

i1                                                        0.006
i2                    0.139
i3                                0.010
i5                                                                                                        0.009
i6                                                                    0.021
i7                                                                                      6.137042E-4
i8        0.147
i9                                            0.402
i10                                                                               0.002

The model results are very close. The assignments are the same. Just the \(c\) value and resulting sum of squared or absolute residuals are somewhat different.

----     77 PARAMETER sumdev  sum of squared/absolute deviations

                  dev^2       |dev|           c

minlp model       0.201       0.784       0.695
mip model         0.204       0.737       0.711

A different approach

We can look at the problem in a different way. Instead of minimizing the spread around a central value \(c\), we can minimize the bandwidth directly. I.e. we can write: \[\begin{align} \min\> & \mathit{maxv} - \mathit{minv} \\ & \mathit{minv} \le \frac{a_i}{b_j} && \forall x_{i,j}=1\\& \mathit{maxv} \ge \frac{a_i}{b_j} && \forall x_{i,j}=1 \end{align}\] These constraints can be implemented as indicator constraints \[\begin{align} & x_{i,j}=1 \Rightarrow \mathit{minv} \le \frac{a_i}{b_j}\\ & x_{i,j}=1 \Rightarrow \mathit{maxv} \ge \frac{a_i}{b_j}\end{align}\] If we don't have indicator constraints available we need to resort to big-\(M\) constraints:

Alternative MIP Model
\[\begin{align} \min\> & \color{darkred}{\mathit{maxv}} - \color{darkred}{\mathit{minv}} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i \\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j \\ & \color{darkred}{\mathit{minv}} \le \frac{\color{darkblue}a_i}{\color{darkblue}b_j} + M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}{\mathit{maxv}} \ge \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{maxv}},\color{darkred}{\mathit{minv}} \text{ free} \end{align}\]

The solution looks like:

----    103 VARIABLE x.L  assignment

             j1          j2          j3          j4          j5          j6          j9         j11         j14         j15

i1                                                                                                        1.000
i2                                1.000
i3                                                                                1.000
i4                                                                                                                    1.000
i5                                                                    1.000
i6                                                                                            1.000
i7                    1.000
i8        1.000
i9                                                        1.000
i10                                           1.000

----    103 VARIABLE minv.L                =        0.308  minimum value
            VARIABLE maxv.L                =        0.858  maximum value

A picture of this solution:

Minimize bandwidth solution

The disadvantage of this method is that it does not care about assignments as long as they are inside our optimal bandwidth. We can see this if we recalculate an optimal central \(c\) value afterwards. The results with this reconstructed \(c\) value:

----    114 PARAMETER sumdev  sum of squared/absolute deviations

                  dev^2       |dev|           c

minlp model       0.201       0.784       0.695
mip model         0.204       0.737       0.711
mip model 2       0.313       1.548       0.617

Indeed, we see that the total sum of squared or absolute deviations is not as good as we saw before. The same thing can be seen by just calculating the standard deviation for the selected ratios \(a_i/b_j\). This gives:

----    130 PARAMETER stdev  

minlp model 0.149,    mip model   0.149,    mip model 2 0.186

Again we see this last approach leads to more variability.

Distribution of selected a(i)/b(j)

In the picture above, we see that this second MIP model has no incentive to keep values close to each other, beyond requiring that they lie within \([\mathit{minv},\mathit{maxv}]\). This model is simpler than the earlier MIP model, but we pay a price: the solution is not as concentrated around a central point.
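As an illustration, the bandwidth objective can be brute-forced on a tiny made-up instance (plain enumeration, not the MIP model):

```python
# Assign each a_i to a distinct b_j, minimizing max ratio - min ratio.
from itertools import permutations

a = [0.2, 0.8, 0.5]
b = [1.0, 0.6, 0.9, 0.4]

best = None
for perm in permutations(range(len(b)), len(a)):
    ratios = [a[i] / b[j] for i, j in enumerate(perm)]
    width = max(ratios) - min(ratios)
    if best is None or width < best[0]:
        best = (width, perm)
print("minimal bandwidth:", round(best[0], 3), "assignment:", best[1])
```

With \(n\) sources and \(m\) destinations there are \(m!/(m-n)!\) assignments, so this only works for toy data.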

Non-unique \(b_j\)

If we allow the \(b_j\) to be reused, only a small change in the models is needed. We just need to drop the constraint \[\sum_i x_{i,j} \le 1 \] When we do this, we may see a solution like:

Solution. Allow a column to be selected multiple times.


This turned out to be an interesting problem. A high-level MINLP model is presented first. We can linearize it into a MIP, but this becomes a bit messy. Modern MIP solvers offer tools (indicator constraints, a built-in absolute value function) that can help here. Finally, an alternative, simpler MIP model is proposed. However, this model produces solutions that are more dispersed.


Thursday, June 20, 2019

Assignment: Scipy vs Cplex


I have never used this routine [1], so let's give it a try. It is a Python implementation of the Hungarian method [2]. The call is exceedingly simple: just pass it the cost matrix.

The algorithm allows unbalanced assignment problems by just specifying a rectangular cost matrix. Typically, unbalanced problems are transformed into balanced ones by adding source or destination nodes.
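A minimal usage sketch (the small cost matrix is the example from the Scipy documentation):

```python
# Solve a tiny assignment problem with scipy's Hungarian-method routine.
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[4, 1, 3],
                 [2, 0, 5],
                 [3, 2, 2]])
row_ind, col_ind = linear_sum_assignment(cost)
total = cost[row_ind, col_ind].sum()
print(col_ind, total)  # optimal assignment and its total cost
```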

I believe the description is slightly inaccurate:

Description from the Scipy documentation

This would allow zero assignments, i.e. \(x_{i,j}=0\). It is not always easy to create a watertight English description.

Let's try a problem.

1000x1000 dense assignment problem

When we feed a random problem with 1,000 sources and 1,000 destinations, we see the following timings:

We first read in the data (organized as lists). The second step is to convert this to a NumPy array (otherwise this is done automatically by linear_sum_assignment). Then we solve the problem. On my laptop this takes about 325 seconds.

We can formulate this problem as an LP:\[\begin{align}\min \> & \sum_{i,j} c_{i,j} x_{i,j}\\ & \sum_j x_{i,j} = 1 && \forall i \\ & \sum_i x_{i,j} =1 && \forall j \\ & x_{i,j} \ge 0 \end{align}\]

When we feed this into Cplex we see:

Parallel mode: deterministic, using up to 4 threads for concurrent optimization.
Tried aggregator 1 time.
LP Presolve eliminated 2 rows and 1 columns.
Reduced LP has 1999 rows, 1000000 columns, and 1999000 nonzeros.
Presolve time = 5.95 sec. (770.92 ticks)
Initializing dual steep norms . . .

Iteration log . . .
Iteration:     1   Dual objective     =             1.007485
Iteration:  1113   Dual objective     =             1.661162

Dual simplex solved model.

LP status(1): optimal
Cplex Time: 14.91sec (det. 2309.10 ticks)

Cplex tries several LP algorithms in parallel (concurrent optimization). It turns out Dual Simplex wins. The total solution time is about 15 seconds, so Cplex is roughly 20 times as fast.

We can actually improve on this a bit. This is a network problem, so we can use the Network solver. In addition, we disable the presolver (it is somewhat expensive and does little for this problem).

Extracted network with 2002 nodes and 1000001 arcs.
Extraction time = 0.63 sec. (49.15 ticks)

Iteration log . . .
Iteration:     0   Infeasibility     =          2000.000000 (2000)
Iteration: 10000   Infeasibility     =             0.000000 (7.62808)
Iteration: 20000   Objective         =             5.108089
Iteration: 30000   Objective         =             3.842909
Iteration: 40000   Objective         =             3.147366
Iteration: 50000   Objective         =             2.622157
Iteration: 60000   Objective         =             2.275971
Iteration: 70000   Objective         =             1.990956
Iteration: 80000   Objective         =             1.790689
Elapsed time = 1.33 sec. (82.13 ticks)
Iteration: 90000   Objective         =             1.684441

Network - Optimal:  Objective =    1.6701729950e+00
Network time = 1.67 sec. (119.82 ticks)  Iterations = 92769 (10000)
LP status(1): optimal
Cplex Time: 2.66sec (det. 236.01 ticks)

Now we need less than 3 seconds (Cplex is about 120 times as fast as Scipy).

We would normally expect a specialized assignment solver to do better than a general purpose LP solver. However, Cplex is a highly polished product and we run on the bare metal. Using Python and a simpler implementation just causes us to lose. Note that there are specialized solvers that are much faster.

Some solvers in Scipy are a little bit underwhelming. Examples are the linear programming simplex solver and, as we saw here, the assignment problem solver.


  1. scipy.optimize.linear_sum_assignment
  2. Harold W. Kuhn. The Hungarian Method for the assignment problem. Naval Research Logistics Quarterly, 2:83-97, 1955.

Wednesday, June 12, 2019

Machine Scheduling

Here [1] is a problem with multiple machines:
  • There are 3 machines and 4 jobs in this example
  • Not all machines can do all jobs
  • We have a few precedence constraints
  • Given known processing times, find a schedule that minimizes the makespan (the time the last job is finished)
Assume the following data:

----     63 --------------- data ---

----     63 SET j  jobs

job1,    job2,    job3,    job4

----     63 SET m  machines

machine1,    machine2,    machine3

----     63 SET ok  allowed job/machine combinations

        machine1    machine2    machine3

job1         YES         YES
job2         YES         YES
job3         YES                     YES
job4                                 YES

----     63 PARAMETER proctime  processing time

job1  4.000,    job2  2.000,    job3 10.000,    job4 12.000

----     63 SET prec  precedence constraints

            job2        job4

job1         YES         YES

There are many ways to model this. Some approaches can be:

  • Discrete time using time slots. This leads to binary variables \[x_{j,m,t} = \begin{cases} 1 & \text{if job $j$ executes on machine $m$ at time $t$}\\ 0 & \text{otherwise}\end{cases}\]
  • Continuous time with a binary variable indicating if job \(j_1\) is executed before job \(j_2\) (on the same machine)
  • Continuous time with a binary variable indicating if job \(j_1\) immediately precedes job \(j_2\) (on the same machine)  
Let's try the second approach. First we need a planning horizon. A simplistic way to find a planning horizon \(T\) is just to schedule the jobs one after another: \[T = \sum_j \mathit{proctime}_j \] For our small data set, this gives:

----     63 PARAMETER T                    =       28.000  max time

This \(T\) is used as a big-\(M\) constant, so we would like it to be as small as possible. For very large problems, we often use some heuristic to find an initial solution, and use this to set the planning horizon \(T\). If that is not possible, we may want to use indicator constraints, so we don't need any \(T\).

Decision Variables

We have two blocks of binary variables. One dealing with assigning jobs to machines and a second one about ordering of jobs on the same machine. The latter is handled by implementing no-overlap constraints. For this we need a binary variable indicating if job \(j_1\) is before or after job \(j_2\).

Binary variables
\[\color{DarkRed}{\mathit{assign}}_{j,m} = \begin{cases} 1 & \text{if job $j$ is placed on machine $m$}\\ 0 & \text{otherwise}\end{cases}\] \[\color{DarkRed}{\mathit{after}}_{j_1,j_2} = \begin{cases} 1 & \text{if job $j_1$ is executed after job $j_2$ when placed on the same machine}\\ 0 & \text{if job $j_1$ is executed before job $j_2$ when placed on the same machine}\end{cases}\]

Note that the variable \(\mathit{after}_{j_1,j_2}\) will be used only for \(j_1 \lt j_2\). This is to avoid double checking the same pair.
We also have a few continuous variables:

Continuous variables
\[\begin{align} & \color{DarkRed}{\mathit{start}}_{j} \ge 0 && \text{start time of job $j$}\\ & \color{DarkRed}{\mathit{finish}}_{j} \ge 0 && \text{finish time of job $j$}\\ & \color{DarkRed}{\mathit{makespan}} \ge 0 && \text{last finish time} \end{align}\]


The model can look like:

Mixed Integer Programming Model
\[\begin{align} \min\> & \color{DarkRed}{\mathit{makespan}} \\ & \color{DarkRed}{\mathit{makespan}} \ge \color{DarkRed}{\mathit{finish}}_j && \forall j \\ & \color{DarkRed}{\mathit{finish}}_j = \color{DarkRed}{\mathit{start}}_j + \color{DarkBlue}{\mathit{proctime}}_j && \forall j\\ & \sum_m \color{DarkRed}{\mathit{assign}}_{j,m} = 1 && \forall j\\ & \color{DarkRed}{\mathit{start}}_{j_1} \ge \color{DarkRed}{\mathit{finish}}_{j_2} - \color{DarkBlue}T (1-\color{DarkRed}{\mathit{after}}_{j_1,j_2}) - \color{DarkBlue}T (1-\color{DarkRed}{\mathit{assign}}_{j_1,m})- \color{DarkBlue}T (1-\color{DarkRed}{\mathit{assign}}_{j_2,m}) && \forall m, j_1 \lt j_2 \\ & \color{DarkRed}{\mathit{start}}_{j_2} \ge \color{DarkRed}{\mathit{finish}}_{j_1} - \color{DarkBlue}T \color{DarkRed}{\mathit{after}}_{j_1,j_2} - \color{DarkBlue}T (1-\color{DarkRed}{\mathit{assign}}_{j_1,m})- \color{DarkBlue}T (1-\color{DarkRed}{\mathit{assign}}_{j_2,m}) && \forall m, j_1 \lt j_2 \\ & \color{DarkRed}{\mathit{start}}_{j_2} \ge \color{DarkRed}{\mathit{finish}}_{j_1} && \forall \color{DarkBlue}{\mathit{prec}}_{j_1,j_2} \\ & \color{DarkRed}{\mathit{assign}}_{j,m} = 0 && \forall \text{ not } \color{DarkBlue}{\mathit{ok}}_{j,m} \end{align}\]

The no-overlap constraints are the most complicated in this model. If the problem were a single-machine scheduling problem, the no-overlap constraints would look like: \[\mathit{start}_{j_1} \ge \mathit{finish}_{j_2} \text{ or } \mathit{start}_{j_2} \ge \mathit{finish}_{j_1} \>\> \forall j_1 \lt j_2\] Note that we only need to compare jobs \(j_1 \lt j_2\) (each pair needs to be checked only once). The OR condition can be implemented using a binary variable and big-\(M\) constraints: \[\begin{align} & \mathit{start}_{j_1} \ge \mathit{finish}_{j_2} - T (1-{\mathit{after}}_{j_1,j_2}) && \forall j_1 \lt j_2 \\ & \mathit{start}_{j_2} \ge \mathit{finish}_{j_1} - T {\mathit{after}}_{j_1,j_2} && \forall j_1 \lt j_2 \end{align}\] i.e. job \(j_1\) executes before or after job \(j_2\), but not in parallel. When we have multiple machines, we only want to keep this constraint active when jobs \(j_1\) and \(j_2\) are assigned to the same machine. That is: jobs are allowed to execute in parallel when they are on different machines. In a sense, we have built some nested big-\(M\) constraints.
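To make the disjunction concrete, here is a small Python sketch (the helper names are mine, not from the model) that checks a candidate schedule for overlapping jobs on the same machine. This is exactly the condition the big-\(M\) constraints enforce inside the model:

```python
from itertools import combinations

def overlaps(s1, f1, s2, f2):
    """Two intervals overlap if each one starts before the other finishes."""
    return s1 < f2 and s2 < f1

def check_no_overlap(start, finish, machine):
    """True if no two jobs on the same machine run in parallel."""
    for j1, j2 in combinations(start, 2):
        if machine[j1] == machine[j2] and \
           overlaps(start[j1], finish[j1], start[j2], finish[j2]):
            return False
    return True

# Solution of the small example: job1 and job2 share machine2.
start   = {"job1": 0, "job2": 4, "job3": 0, "job4": 4}
finish  = {"job1": 4, "job2": 6, "job3": 10, "job4": 16}
machine = {"job1": "m2", "job2": "m2", "job3": "m1", "job4": "m3"}
print(check_no_overlap(start, finish, machine))  # True
```

Note that touching intervals (job2 starting exactly when job1 finishes) do not count as an overlap, which matches the \(\ge\) in the constraints.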


After running this model, we get the following results:

            --------------- solution -----

----     66 VARIABLE assign.L  assign job to machine

        machine1    machine2    machine3

job1                   1.000
job2                   1.000
job3       1.000
job4                               1.000

----     66 VARIABLE start.L  job start time

job2 4.000,    job4 4.000

----     66 VARIABLE finish.L  job finish time

job1  4.000,    job2  6.000,    job3 10.000,    job4 16.000

----     66 VARIABLE makespan.L            =       16.000  time last job is finished

----     66 VARIABLE after.L  j1 is scheduled after j2 on machine m

             job3

job1        1.000

How should we interpret the variable \(\mathit{after}\)? Only jobs 1 and 2 are on the same machine, and for this combination we have \(\mathit{after}_{1,2}=0\). This means job1 is not after job2, and thus job2 must be after job1. This is indeed what is happening. The values of \(\mathit{after}\) for jobs not on the same machine are essentially arbitrary: they do not appear in any active constraints. This means the printed value \(\mathit{after}_{1,3}=1\) is not relevant, as jobs 1 and 3 are on different machines.
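This filtering step can be sketched in Python: keep an \(\mathit{after}\) value only when both jobs share a machine (the function and data names are hypothetical):

```python
def orderings(after, machine):
    """Turn raw 'after' values into job orderings, ignoring pairs
    that are not on the same machine (those values are free)."""
    result = {}
    for (j1, j2), v in after.items():
        if machine[j1] == machine[j2]:
            # after=1 means j1 runs after j2; after=0 means before.
            result[(j1, j2)] = "after" if v == 1 else "before"
    return result

machine = {"job1": "m2", "job2": "m2", "job3": "m1", "job4": "m3"}
after = {("job1", "job2"): 0, ("job1", "job3"): 1}  # from the solution
print(orderings(after, machine))  # {('job1', 'job2'): 'before'}
```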

Graphically this looks like:

We see that this solution obeys the precedence constraints (jobs 2 and 4 after job 1). Furthermore, it only assigns jobs to allowed machines (e.g. job 4 can only run on machine 3).

Larger data set

The above data set is very small. Such a data set is great during development. A beginner's mistake I often see is using a large data set during model development. It is much more convenient and efficient to use a very small data set when things are in flux and you do many runs just to get the model right.

Now, it is important to see how the model behaves with a large data set. I used random data with 8 machines and 50 jobs. This problem needed only a few seconds to solve to guaranteed optimality. The model had about 20k constraints and 1,700 variables (of which 1,400 were binary).
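The constraint count is easy to verify: the no-overlap block alone contributes two big-\(M\) constraints for every (machine, job pair) combination. A quick back-of-the-envelope check:

```python
n_jobs, n_machines = 50, 8

# Each unordered job pair (j1 < j2) needs two big-M no-overlap
# constraints for every machine.
pairs = n_jobs * (n_jobs - 1) // 2        # 1225 job pairs
no_overlap = 2 * n_machines * pairs
print(no_overlap)  # 19600: this block dominates the ~20k constraints
```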

solution of 8 machine, 50 job problem with random data

The performance seems quite reasonable. The main problem is that for large models the number of constraints becomes large, and the big-\(M\) constraints may not be as tight as we want. But for a first formulation, not so bad.


Appendix: large data set

----     64 --------------- data ---

----     64 SET j  jobs

job1 ,    job2 ,    job3 ,    job4 ,    job5 ,    job6 ,    job7 ,    job8 ,    job9 ,    job10,    job11,    job12
job13,    job14,    job15,    job16,    job17,    job18,    job19,    job20,    job21,    job22,    job23,    job24
job25,    job26,    job27,    job28,    job29,    job30,    job31,    job32,    job33,    job34,    job35,    job36
job37,    job38,    job39,    job40,    job41,    job42,    job43,    job44,    job45,    job46,    job47,    job48
job49,    job50

----     64 SET m  machines

machine1,    machine2,    machine3,    machine4,    machine5,    machine6,    machine7,    machine8

----     64 SET ok  allowed job/machine combinations

         machine1    machine2    machine3    machine4    machine5    machine6    machine7    machine8

job1          YES                                 YES         YES         YES         YES
job2          YES                                                                     YES
job3          YES         YES                                 YES         YES         YES         YES
job4                                  YES                                 YES         YES
job5          YES                     YES         YES
job6                      YES         YES         YES         YES         YES
job7                      YES                                             YES         YES         YES
job8                                  YES                     YES         YES
job9          YES         YES                                 YES         YES         YES         YES
job10         YES                     YES                     YES         YES                     YES
job11         YES         YES         YES                     YES         YES         YES         YES
job12         YES                                 YES         YES                     YES
job13         YES                     YES                     YES         YES
job14                     YES         YES         YES         YES                                 YES
job15         YES         YES         YES                     YES         YES         YES
job16         YES         YES         YES         YES         YES                                 YES
job17                                                         YES
job18         YES                                 YES                     YES
job19                     YES         YES         YES         YES                                 YES
job20         YES                     YES                                 YES         YES
job21                                             YES         YES         YES                     YES
job22                     YES                                                         YES
job23                                 YES         YES         YES         YES         YES         YES
job24                                 YES         YES                     YES                     YES
job25                     YES                     YES                                 YES
job26                     YES         YES                                 YES
job27         YES                                 YES                     YES
job28         YES         YES                                 YES
job29         YES                     YES
job30         YES
job31                                 YES                     YES                     YES         YES
job32                                                         YES         YES
job33                     YES                     YES         YES                     YES         YES
job34                                 YES                     YES                     YES
job35         YES                                             YES         YES                     YES
job36                                 YES         YES                                 YES
job37                     YES                     YES                     YES
job38                                             YES                                             YES
job39                                             YES         YES         YES                     YES
job40         YES                                                                     YES
job41                     YES                                 YES         YES
job42                     YES         YES         YES
job43         YES         YES                     YES                                 YES         YES
job44                     YES         YES                                                         YES
job45         YES                                 YES         YES                     YES         YES
job46         YES         YES                     YES                     YES
job47                     YES         YES         YES                     YES         YES
job48                     YES                     YES
job49         YES                                 YES                     YES         YES
job50         YES

----     64 PARAMETER proctime  

job1   7.000,    job2   6.000,    job3   3.000,    job4   3.000,    job5   6.000,    job6   3.000,    job7  10.000
job8   1.000,    job9   7.000,    job10  3.000,    job11  7.000,    job12  9.000,    job13  2.000,    job14  3.000
job15  9.000,    job16 10.000,    job17  2.000,    job18  4.000,    job19  4.000,    job20  8.000,    job21 10.000
job22  6.000,    job23  5.000,    job24  7.000,    job25  1.000,    job26  6.000,    job27 10.000,    job28  7.000
job29  7.000,    job30  5.000,    job31  6.000,    job32 10.000,    job33  9.000,    job34  5.000,    job35  6.000
job36  7.000,    job37  4.000,    job38  9.000,    job39  2.000,    job40  5.000,    job41  7.000,    job42  7.000
job43  7.000,    job44  6.000,    job45  4.000,    job46  4.000,    job47  3.000,    job48  9.000,    job49  9.000
job50  5.000

----     64 SET prec  precedence constraints

             job5        job7       job12       job15       job17       job18       job19       job20       job21

job1                                                          YES
job2                      YES                                 YES
job4          YES
job6                      YES
job9                                                                      YES                     YES
job10                                                                                 YES
job11                                 YES         YES
job12                                                                                                         YES
job14                                                         YES
job15                                                                     YES
job16                                                                                             YES

    +       job22       job23       job24       job25       job26       job27       job28       job29       job30

job2                                                          YES                     YES
job4                                                                      YES                                 YES
job6                                              YES
job8                                                                                  YES
job10                     YES         YES                                 YES
job14                                                                                 YES
job16                     YES
job17         YES                                                                     YES
job18         YES
job21                     YES                                             YES
job23                                                                                 YES
job24                                                         YES                                 YES         YES
job26                                                                                                         YES
job27                                                                                 YES
job29                                                                                                         YES

    +       job31       job32       job33       job34       job35       job36       job37       job38       job39

job2          YES                                                                                 YES
job3                                                                                                          YES
job4          YES
job5                                                                                                          YES
job6          YES                     YES                                 YES
job7                                                                      YES
job8                                                                                              YES
job9                                                                                              YES
job12         YES                                             YES
job14         YES                                                         YES
job15                                             YES                     YES
job17                                                                     YES                     YES         YES
job18                     YES                     YES                                                         YES
job19                     YES         YES                                 YES
job20                                                         YES
job21                                                                                 YES
job22                                                                     YES                     YES         YES
job23                                                                                 YES
job25                                                                                 YES
job26                                                                                 YES                     YES
job27                                                         YES
job30                                                                                                         YES
job31                     YES                                 YES
job32                                 YES

    +       job40       job41       job42       job43       job44       job45       job46       job47       job48

job1                      YES                     YES
job2          YES
job3                      YES                     YES                                                         YES
job4                                  YES                                 YES
job5          YES                     YES
job6                                                                                  YES
job7                      YES                     YES
job8                                                                                              YES
job9                                                                      YES
job10                                                                                 YES
job11                                 YES                                                         YES
job12                     YES
job13                     YES
job14                                                                     YES                     YES
job15                                                                     YES
job17                                                         YES
job20                                             YES
job21                                                                                 YES
job22         YES
job23                     YES
job24         YES                                                                                 YES
job26                                                         YES                     YES
job27                     YES
job28                                                         YES
job30                                                                                 YES
job32                                             YES                                                         YES
job33                                             YES
job34                                                                     YES                                 YES
job35                     YES
job37                                                                                 YES
job38         YES                                 YES
job41                                                                                                         YES
job42                                                         YES                                 YES
job47                                                                                                         YES

    +       job49       job50

job3          YES
job4                      YES
job6          YES
job13                     YES
job14         YES
job19                     YES
job23                     YES
job27         YES
job31         YES
job35                     YES
job42         YES
job45         YES

----     64 PARAMETER T                    =      295.000  max time
            --------------- solution -----

----     64 VARIABLE assign.L  assign job to machine

         machine1    machine2    machine3    machine4    machine5    machine6    machine7    machine8

job1                                            1.000
job2                                                                                1.000
job3        1.000
job4                                1.000
job5                                            1.000
job6                    1.000
job7                                                                                1.000
job8                                1.000
job9                                                                                            1.000
job10                                                                                           1.000
job11                                                                                           1.000
job12                                                                               1.000
job13       1.000
job14                   1.000
job15       1.000
job16                   1.000
job17                                                       1.000
job18       1.000
job19                   1.000
job20                                                                   1.000
job21                                           1.000
job22                                                                               1.000
job23                               1.000
job24                               1.000
job25                   1.000
job26                   1.000
job27                                                                   1.000
job28                   1.000
job29       1.000
job30       1.000
job31                                                                               1.000
job32                                                       1.000
job33                   1.000
job34                               1.000
job35                                                       1.000
job36                               1.000
job37                   1.000
job38                                                                                           1.000
job39                                           1.000
job40                                                                               1.000
job41                                                       1.000
job42                                           1.000
job43                                                                                           1.000
job44                   1.000
job45       1.000
job46       1.000
job47                                                                               1.000
job48                                           1.000
job49       1.000
job50       1.000

----     64 VARIABLE start.L  job start time

job5   7.000,    job7  22.000,    job8   3.000,    job9   7.000,    job10 14.000,    job12  7.000,    job13  3.000
job14  3.000,    job15  7.000,    job16  6.000,    job17  7.000,    job18 16.000,    job19 17.000,    job20 16.000
job21 16.000,    job22 32.000,    job23 29.000,    job24 17.000,    job25 21.000,    job26 24.000,    job27 26.000
job28 41.000,    job29 24.000,    job30 31.000,    job31 16.000,    job32 22.000,    job33 32.000,    job34 24.000
job35 36.000,    job36 38.000,    job37 48.000,    job38 38.000,    job39 38.000,    job40 47.000,    job41 42.000
job42 26.000,    job43 47.000,    job44 52.000,    job45 36.000,    job46 54.000,    job47 38.000,    job48 49.000
job49 40.000,    job50 49.000

----     64 VARIABLE finish.L  job finish time

job1   7.000,    job2   6.000,    job3   3.000,    job4   3.000,    job5  13.000,    job6   3.000,    job7  32.000
job8   4.000,    job9  14.000,    job10 17.000,    job11  7.000,    job12 16.000,    job13  5.000,    job14  6.000
job15 16.000,    job16 16.000,    job17  9.000,    job18 20.000,    job19 21.000,    job20 24.000,    job21 26.000
job22 38.000,    job23 34.000,    job24 24.000,    job25 22.000,    job26 30.000,    job27 36.000,    job28 48.000
job29 31.000,    job30 36.000,    job31 22.000,    job32 32.000,    job33 41.000,    job34 29.000,    job35 42.000
job36 45.000,    job37 52.000,    job38 47.000,    job39 40.000,    job40 52.000,    job41 49.000,    job42 33.000
job43 54.000,    job44 58.000,    job45 40.000,    job46 58.000,    job47 41.000,    job48 58.000,    job49 49.000
job50 54.000

----     64 VARIABLE makespan.L            =       58.000  time last job is finished

----     64 VARIABLE after.L  j1 is scheduled after j2 on machine m

             job3        job9       job10       job11       job12       job13       job14       job15       job16

job1        1.000
job3                    1.000       1.000       1.000
job5                                1.000       1.000                   1.000
job7                                                        1.000                   1.000
job9                                            1.000                                                       1.000
job10                                           1.000
job12                                                                   1.000                   1.000
job15                                                                                                       1.000

    +       job24       job25       job26       job29       job30       job31       job33       job34       job35

job7                                                                    1.000
job20                                                                   1.000
job22                   1.000       1.000                               1.000       1.000
job23       1.000                                                       1.000                   1.000
job24                                                                   1.000
job27                                           1.000       1.000
job28                                           1.000       1.000       1.000       1.000                   1.000

    +       job41       job42       job45       job46       job47       job48       job49       job50

job22                   1.000
job28       1.000       1.000       1.000
job33                   1.000                               1.000
job35                               1.000       1.000
job37                   1.000
job39                   1.000
job40                               1.000                   1.000
job41                   1.000                               1.000
job43                               1.000       1.000                               1.000       1.000
job44                                                                   1.000
job46                                                                               1.000       1.000