Thursday, June 25, 2020

Select m items but optimize for the best k items


Problem Statement


This is a strange one [1]. Consider a standard objective function: \[\max\>\sum_j p_j\cdot x_j\] where \(p\) are coefficients and \(x\) are binary decision variables. This will look at all \(n\) objective coefficients. Now the question is: 

Can I optimize just for the \(k\) best ones?

The problem is that we don't know in advance what the best \(k\) items will be. Just picking the largest three \(p_j\) is not right: possibly not all of these items can be selected at the same time. So the best way is to let the model pick the \(k\) best items. We can do this by introducing a set of binary variable \(\delta_j\) that can have \(k\) of them turned on. To be precise: \[\begin{align} \max & \sum_j \delta_j (p_j\cdot x_j)\\ & \sum_j \delta_j = k \\ & \delta_j \in \{0,1\}\end{align}\] We introduced extra binary decision variables \(\delta\). But also: we made the objective non-linear.

The question is: how can we model this while keeping the model linear.


Example problem


Here is a small multi-dimensional knapsack-like problem:

Base MIP Model
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\]

The randomly generated data looks like:


----     16 PARAMETER p  objective coefficients

j1  0.172,    j2  0.843,    j3  0.550,    j4  0.301,    j5  0.292,    j6  0.224,    j7  0.350,    j8  0.856
j9  0.067,    j10 0.500


----     16 PARAMETER a  matrix coefficients

            j1          j2          j3          j4          j5          j6          j7          j8          j9

i1       0.998       0.579       0.991       0.762       0.131       0.640       0.160       0.250       0.669
i2       0.360       0.351       0.131       0.150       0.589       0.831       0.231       0.666       0.776
i3       0.110       0.502       0.160       0.872       0.265       0.286       0.594       0.723       0.628
i4       0.413       0.118       0.314       0.047       0.339       0.182       0.646       0.561       0.770
i5       0.661       0.756       0.627       0.284       0.086       0.103       0.641       0.545       0.032

 +         j10

i1       0.435
i2       0.304
i3       0.464
i4       0.298
i5       0.792


----     16 PARAMETER b  rhs values

i1 2.500,    i2 2.500,    i3 2.500,    i4 2.500,    i5 2.500


----     16 PARAMETER m                    =        5.000  number of x to select


To see the "best" \(x_j\), I reformulated the problem to:

Base MIP Model with y variables
\[ \begin{align} \max& \sum_j \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\]


When we solve this problem we see:


----     41 VARIABLE x.L  binary decision variables

j3 1.000,    j5 1.000,    j6 1.000,    j7 1.000,    j8 1.000


----     41 VARIABLE y.L  auxiliary variables (y=p*x)

j3 0.550,    j5 0.292,    j6 0.224,    j7 0.350,    j8 0.856


----     41 VARIABLE z.L                   =        2.273  objective

----     61 PARAMETER yordered  ordered version of y

k1.j8 0.856
k2.j3 0.550  <=== best three
k3.j7 0.350
k4.j5 0.292
k5.j6 0.224


The model says to select items 3,5,6,7 and 8. The three "best" items are j8 (0.856),  j3 (0.550), and j7 (0.350).

Let's optimize for the best three. Note that these are not necessarily the same items as were the best here. We don't know in advance which are the three best! We change the model to:


MIQP Model
\[ \begin{align} \max& \sum_j \color{darkred}\delta_j \cdot \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]


This will automatically indicate the best three items by \(\delta_j=1\) and will optimize just these three. The objective is quadratic and non-convex. So, we can solve this with a non-convex solver (or a solver smart enough to reformulate things automatically). E.g. when we feed this into Cplex we see:

       Classifier predicts products in MIQP should be linearized.[2]

The results look like:



----     75 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000


----     75 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000


----     75 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500


----     75 VARIABLE z.L                   =        1.907  objective

----     89 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550  <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067


This has changed the best three. The third item has changed from  j7 (with contribution 0.350) to j10 (0.5). The selected items become 3,5,8,9, and 10. This is different than before.

There are few linearizations we can consider:

  • Create ordered variables for \(y\) inside the model. Once we know these, just add the first three to the objective. Sorting variables inside a MIP is not completely trivial.
  • Better is to linearize \(\delta_j \cdot y_j = p_j \cdot \delta_j \cdot x_j\) directly. 
  • Even better is to note that \(\delta_j = 1 \Rightarrow x_j=1\) or \(x_j\ge \delta_j\). This is actually the only thing we need. So we end up with:


Linear MIP Model
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}\delta_j \\ & \color{darkred}x_j \ge \color{darkred}\delta_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\]

The solution looks like:



----     52 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000


----     52 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000


----     52 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500


----     52 VARIABLE z.L                   =        1.907  objective

----     72 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550   <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067


This is the same solution as obtained by the MIQP model.


Conclusion


It turns out that linearizing the problem of finding the "best three" is rather easy. A little bit surprising to me: I expected to have to do some complicated sorting. Obviously, we exploited here that the decision variables \(x_j\) are binary variables.

References


  1. With R and lpsolve is there a way of optimising for the best 11 elements but still pick 15 elements in total? https://stackoverflow.com/questions/62511975/with-r-and-lpsolve-is-there-a-way-of-optimising-for-the-best-11-elements-but-sti 
  2. Pierre Bonami, Andrea Lodi, Giulia Zarpellon, A Classifier to Decide on the Linearization of Mixed-Integer Quadratic Problems in CPLEX, http://www.optimization-online.org/DB_HTML/2020/03/7662.html

Friday, June 19, 2020

Small example: ordering of binary variables


Problem description


Consider the vector \[v = (1, 3, 6, 4, 7, 9, 6, 2, 3) \] Partition \(v\) into two consecutive subsets, such that the difference of the sum of the values in each subset is minimized. More formally, find the index \(k\) such that \[z = \left| \sum_{i \lt k} v_i - \sum_{i \ge k} v_i \right|\] is minimized. I.e. the sum left of the break point is as close as possible to the sum right of it.

This is easily programmed in code. However, the question is, can we do this in the form of an optimization problem? [1]

Model formulation


The trick is to have a binary variable \(\delta_i \in \{0,1\}\) associated with each observation \(v_i\), \(i=1,\dots, n\). The interpretation is: \[ \delta_i = \begin{cases} 0 & \text{if $v_i$ belongs to the first subset}\\ 1 & \text{if $v_i$ belongs to the second subset}\end{cases}\] To make sure all \(\delta_i = 0\) are to the left and all all \(\delta_i = 1\) are to the right of the break point, we require \[ \delta_{i+1} \ge \delta_{i}\] This implies we start with \(\delta_i = 0\) and when we see \(\delta_i = 1\) it will never go back to zero afterwards. The high-level model can look like:


Integer-programming model
\[\begin{align}\min\>& \color{darkred} z= |\color{darkred}y_{\mathit{left}} - \color{darkred}y_{\mathit{right}}| \\ &\color{darkred}y_{\mathit{left}} = \sum_i (1-\color{darkred}\delta_i) \color{darkblue}v_i \\ &\color{darkred}y_{\mathit{right}} = \sum_i \color{darkred}\delta_i \color{darkblue}v_i \\ & \color{darkred}k = 1 + \sum_i (1-\color{darkred}\delta_i) \\ & \color{darkred}\delta_{i+1} \ge \color{darkred}\delta_{i} \\ & \color{darkred}\delta_{i} \in \{0,1\} \end{align}\]

Notes:

  • The expression \( \sum_i (1-\delta_i) v_i\), sums up \(v_i\) where \(\delta_i=0\), while \( \sum_i \delta_i v_i\), adds up \(v_i\) where \(\delta_i=1\). 
  • Make sure the ordering constraint does not go over the boundary: the constraint \(\delta_{i+1} \ge \delta_{i}\) should only hold for \(i=1,\dots,n-1\).  
  • Obviously, we can also write: \(\delta_{i-1} \le \delta_{i}\). 
  • The absolute value can be linearized by formulating it as \[\begin{align} \min\> & z\\ & -z \le y_{\mathit{left}} - y_{\mathit{right}} \le z\\ & z\ge 0\end{align}\] 
  • It is noted, that there are no big-M constraints in the model. As we shall see later, this model solved very quickly.


The results look like:


----     50 PARAMETER results  

               i1          i2          i3          i4          i5          i6          i7          i8          i9

v               1           3           6           4           7           9           6           2           3
delta                                                                       1           1           1           1


----     50 VARIABLE y.L  sum of partitions

left  21.000,    right 20.000


----     50 VARIABLE k.L                   =            6  breakpoint
            VARIABLE z.L                   =            1  objective


The zeros are not printed.

Of course, the model works for larger sets, and in cases where \(v_i\) are floating-point numbers. They may also contain negative numbers. An example is:


----     51 PARAMETER v  data

i1   -65.651,    i2    68.653,    i3    10.075,    i4   -39.772,    i5   -41.558,    i6   -55.189,    i7   -30.034
i8    71.254,    i9   -86.577,    i10    0.042,    i11   99.624,    i12   15.747,    i13   98.227,    i14   52.450
i15  -73.862,    i16   27.944,    i17  -68.096,    i18  -49.984,    i19   33.786,    i20  -12.929,    i21  -28.060
i22  -29.712,    i23  -73.702,    i24  -69.980,    i25   17.823,    i26   66.179,    i27  -53.837,    i28   33.147
i29   55.172,    i30  -39.268,    i31  -77.902,    i32    0.477,    i33  -67.965,    i34   74.492,    i35  -46.977
i36  -42.837,    i37   18.791,    i38   44.544,    i39   25.650,    i40   -7.240,    i41  -17.339,    i42  -76.461
i43  -37.158,    i44  -90.690,    i45  -32.290,    i46  -63.580,    i47   29.145,    i48   12.149,    i49   53.992
i50  -40.439,    i51   32.221,    i52   51.164,    i53   25.489,    i54  -43.227,    i55  -82.715,    i56  -79.497
i57   28.250,    i58    9.062,    i59  -93.695,    i60   58.472,    i61  -85.447,    i62  -64.868,    i63    5.127
i64   50.042,    i65  -64.375,    i66  -93.172,    i67   17.026,    i68   24.246,    i69  -22.128,    i70  -28.257
i71  -51.393,    i72  -50.716,    i73  -73.899,    i74   86.690,    i75  -24.012,    i76   56.680,    i77  -39.993
i78  -74.903,    i79   49.775,    i80  -86.154,    i81  -59.597,    i82  -98.987,    i83  -46.077,    i84   -0.030
i85  -69.743,    i86  -65.166,    i87  -33.872,    i88  -36.619,    i89  -35.583,    i90   92.795,    i91   98.720
i92  -26.019,    i93  -25.422,    i94   54.396,    i95  -20.663,    i96   82.619,    i97  -76.084,    i98   47.096
i99  -88.916,    i100  15.260


----     51 VARIABLE delta.L  ordered binary variables

i62  1,    i63  1,    i64  1,    i65  1,    i66  1,    i67  1,    i68  1,    i69  1,    i70  1,    i71  1,    i72  1
i73  1,    i74  1,    i75  1,    i76  1,    i77  1,    i78  1,    i79  1,    i80  1,    i81  1,    i82  1,    i83  1
i84  1,    i85  1,    i86  1,    i87  1,    i88  1,    i89  1,    i90  1,    i91  1,    i92  1,    i93  1,    i94  1
i95  1,    i96  1,    i97  1,    i98  1,    i99  1,    i100 1


----     51 VARIABLE y.L  sum of partitions

left  -689.647,    right -676.178


----     51 VARIABLE k.L                   =           62  breakpoint
            VARIABLE z.L                   =       13.469  objective

This model with \(n=100\) solves instantaneously. For \(n=1,000\) this becomes a second or so.


The ordering constraint is a useful construct to have in your toolbox. This is used in other contexts as well. It can be used to order solutions for better display and interpretation. It can also be used to reduce symmetry in certain models.

GAMS implementation


Here follows the GAMS model. Note the handling of the order constraint with the lagged variable reference. This can be debugged (or verified its correctness) by inspecting the Equation Listing (after increasing option limrow).

set
   i
'observations' /i1*i9/
   j
'partitions' /left,right/
;

parameter v 'data' /i1 1, i2 3, i3 6, i4 4, i5 7, i6 9, i7 6, i8 2, i9 3/;
display v;

binary variable delta(i) 'ordered binary variables';
variable
   z    
'objective'
   y(j) 
'sum of partitions'
   k    
'breakpoint'
;

equations
   obj1     
'linearized objective'
   obj2     
'linearized objective'
   ydef1    
'y(left)'
   ydef2    
'y(right)'
   order(i) 
'ordering of binary variables'
   kdef     
'breakpoint'
;

*
* objective: min abs(y('left')-y('right'))
* we linearize this
*
obj1.. -z =l= y(
'left')-y('right');
obj2..        y(
'left')-y('right') =l= z;
z.lo = 0;

*
*  calculate y
*
ydef1.. y(
'left') =e= sum(i, (1-delta(i))*v(i));
ydef2.. y(
'right') =e= sum(i, delta(i)*v(i));

*
* ordering of binary variables
* check the equation listing (after using option limrow)
*
option limrow=10;
order(i-1).. delta(i) =g= delta(i-1);

*
* breakpoint
*
kdef.. k =e= 1+
sum(i,(1-delta(i)));

model m /all/;
option optcr=0;
solve m minimizing z using mip;

*
* print results
*

parameter results(*,*);
results(
'v',i) = v(i);
results(
'delta',i) = delta.l(i);

option results:0,k:0,z:0;
display
     results,
     y.l,
     k.l,
     z.l
;



References


  1. Optimize with indexing in linear programming, https://stackoverflow.com/questions/62432731/optimize-with-indexing-in-linear-programming

Friday, June 12, 2020

Designing a sports schedule



In [1] a question was posed about designing a sports schedule.

We have the following information:

  • There are 24 teams
  • There are 23 rounds
  • In every round, each team plays against another team. A team plays exactly once in a round. I.e., there are 12 games per round.
  • A team \(t1\) plays against team \(t2\) exactly once.

This is actually not too difficult to model.

Math Programming Model


We define as binary decision variable: \[\mathit{game}_{t1,t2,r} = \begin{cases} 1 & \text{if team \(t1\) plays against team \(t2\) in round \(r\)}\\ 0 & \text{otherwise} \end{cases}\] To prevent double counting, we use the convention that \(\mathit{game}_{t1,t2,r}\) is only defined when \(t1 \lt t2\). So, we have in the model \(\mathit{game}_{team2,team8,round4}\), but we will never use: \(\mathit{game}_{team8,team2,round4}\). This is a little bit of a complicating factor in the model, but the pay-off is that the bookkeeping becomes in the end simpler.

With this we can formulate:

Mixed-Integer Programming Model
\[\begin{align}\min\>& 0 && &&\text{dummy objective} \\ & \sum_{t2|t1\lt t2} \color{darkred}{\mathit{game}}_{t1,t2,r} + \sum_{t2|t2\lt t1} \color{darkred}{\mathit{game}}_{t2,t1,r} = 1 && \forall t1,r &&\text{play exactly one opponent each round}\\ & \sum_r \color{darkred}{\mathit{game}}_{t1,t2,r}=1&&\forall t1,t2|t1 \lt t2 && \text{meet opponent exactly once}\\ &\color{darkred}{\mathit{game}}_{t1,t2,r} \in \{0,1\} &&\forall t1,t2,r|t1 \lt t2 &&\text{binary variables} \end{align}\]

The model does not have a real objective: we are just looking for a feasible solution. The first constraint is the most complicated one. Basically it says \[\sum_{t2} {\mathit{game}}_{t1,t2,r} = 1 \>\> \forall t1,r\] but as we only use \({\mathit{game}}_{t1,t2,r}\) for \(t1 \lt t2\), we need to split things in two. This can be visualized as follows:

Opponents of team t3

If we would not have excluded all \(t2\ge t1\), but used a fully allocated \({\mathit{game}}_{t1,t2,r}\) we would have needed to add the constraints: \[\begin{align} & {\mathit{game}}_{t1,t2,r} = {\mathit{game}}_{t2,t1,r} \\ & {\mathit{game}}_{t1,t1,r}=0 \end{align}\] This will make the model larger, but a good presolver should help with that. Although for this model it probably does not make much difference, I prefer explicitly excluding all \(t2\ge t1\).

This is a rather simple model. But it will deliver the schedules we are looking for:



----     24 VARIABLE game.L  schedule for teams

                   round1      round2      round3      round4      round5      round6      round7      round8

team1 .team15                                   1
team1 .team16                                               1
team1 .team17                                                                                   1
team1 .team18                                                           1
team1 .team19           1
team1 .team20                                                                                               1
team1 .team22                                                                       1
team1 .team24                       1
team2 .team12                                                                                   1
team2 .team15                                               1
team2 .team16                                                                                               1
team2 .team19                       1
team2 .team20                                                           1
team2 .team21                                                                       1
team2 .team23           1
team2 .team24                                   1
team3 .team12           1
team3 .team13                                                                       1
team3 .team14                                                                                               1
team3 .team15                       1
team3 .team16                                                                                   1
team3 .team21                                                           1
team3 .team23                                   1
team3 .team24                                               1
team4 .team10                                                                                               1
team4 .team14                                                                                   1
team4 .team15           1
team4 .team16                                                           1
team4 .team18                       1
team4 .team21                                               1
team4 .team22                                   1
team4 .team23                                                                       1
team5 .team11                                               1
team5 .team15                                                           1
team5 .team16                       1
team5 .team18                                                                                   1
team5 .team19                                                                                               1
team5 .team20                                                                       1
team5 .team21                                   1
team5 .team22           1
team6 .team9                                    1
team6 .team10                                                           1
team6 .team11           1
team6 .team16                                                                       1
team6 .team17                                                                                               1
team6 .team20                                                                                   1
team6 .team22                                               1
team6 .team23                       1
team7 .team9                        1
team7 .team10                                                                                   1
team7 .team12                                                           1
team7 .team14                                                                       1
team7 .team16                                   1
team7 .team19                                               1
team7 .team21                                                                                               1
team7 .team24           1
team8 .team9                                                1
team8 .team11                                   1
team8 .team12                                                                       1
team8 .team13                                                                                               1
team8 .team14                       1
team8 .team16           1
team8 .team19                                                           1
team8 .team22                                                                                   1
team9 .team18                                                                       1
team9 .team20           1
team9 .team21                                                                                   1
team9 .team22                                                           1
team9 .team23                                                                                               1
team10.team17           1
team10.team20                                   1
team10.team21                       1
team10.team23                                               1
team10.team24                                                                       1
team11.team17                       1
team11.team18                                                                                               1
team11.team19                                                                       1
team11.team23                                                                                   1
team11.team24                                                           1
team12.team17                                   1
team12.team18                                               1
team12.team22                       1
team12.team24                                                                                               1
team13.team17                                               1
team13.team18           1
team13.team19                                   1
team13.team20                       1
team13.team23                                                           1
team13.team24                                                                                   1
team14.team17                                                           1
team14.team18                                   1
team14.team20                                               1
team14.team21           1
team15.team17                                                                       1
team15.team19                                                                                   1
team15.team22                                                                                               1

            +      round9     round10     round11     round12     round13     round14     round15     round16

team1 .team9                                                                                                1
team1 .team10                                                                                   1
team1 .team11                                                                       1
team1 .team12                                                           1
team1 .team13                                               1
team1 .team14                                   1
team1 .team21                       1
team1 .team23           1
team2 .team9                                                                                    1
team2 .team10                                               1
team2 .team11                                                           1
team2 .team13                       1
team2 .team14           1
team2 .team17                                                                                               1
team2 .team18                                                                       1
team2 .team22                                   1
team3 .team9                                                                        1
team3 .team10                                   1
team3 .team11                                                                                               1
team3 .team17                                                           1
team3 .team18                                                                                   1
team3 .team19                       1
team3 .team20           1
team3 .team22                                               1
team4 .team9                                                            1
team4 .team11           1
team4 .team12                       1
team4 .team13                                                                       1
team4 .team17                                                                                   1
team4 .team19                                                                                               1
team4 .team20                                   1
team4 .team24                                               1
team5 .team9                                                1
team5 .team10                                                           1
team5 .team12                                   1
team5 .team13                                                                                               1
team5 .team14                                                                                   1
team5 .team17                                                                       1
team5 .team23                       1
team5 .team24           1
team6 .team12           1
team6 .team13                                                                                   1
team6 .team14                       1
team6 .team15                                                                       1
team6 .team18                                               1
team6 .team19                                                           1
team6 .team21                                                                                               1
team6 .team24                                   1
team7 .team11                                               1
team7 .team13                                   1
team7 .team15                                                                                               1
team7 .team17                       1
team7 .team18           1
team7 .team20                                                           1
team7 .team22                                                                                   1
team7 .team23                                                                       1
team8 .team10                       1
team8 .team15           1
team8 .team17                                               1
team8 .team18                                   1
team8 .team20                                                                       1
team8 .team21                                                                                   1
team8 .team23                                                                                               1
team8 .team24                                                           1
team9 .team17                                   1
team9 .team19           1
team9 .team24                       1
team10.team18                                                                                               1
team10.team19                                                                       1
team10.team22           1
team11.team20                                                                                   1
team11.team21                                   1
team11.team22                       1
team12.team19                                                                                   1
team12.team20                                                                                               1
team12.team21                                                                       1
team12.team23                                               1
team13.team21           1
team13.team22                                                           1
team14.team19                                               1
team14.team22                                                                                               1
team14.team23                                                           1
team14.team24                                                                       1
team15.team18                                                           1
team15.team20                       1
team15.team21                                               1
team15.team23                                   1
team15.team24                                                                                   1
team16.team17           1
team16.team18                       1
team16.team19                                   1
team16.team20                                               1
team16.team21                                                           1
team16.team22                                                                       1
team16.team23                                                                                   1
team16.team24                                                                                               1

            +     round17     round18     round19     round20     round21     round22     round23

team1 .team2                                                                                    1
team1 .team3                                                                        1
team1 .team4                                                            1
team1 .team5                                                1
team1 .team6                                    1
team1 .team7                        1
team1 .team8            1
team2 .team3                                                            1
team2 .team4                                                                        1
team2 .team5                                    1
team2 .team6                                                1
team2 .team7            1
team2 .team8                        1
team3 .team4                                                                                    1
team3 .team5                        1
team3 .team6            1
team3 .team7                                                1
team3 .team8                                    1
team4 .team5            1
team4 .team6                        1
team4 .team7                                    1
team4 .team8                                                1
team5 .team6                                                                                    1
team5 .team7                                                                        1
team5 .team8                                                            1
team6 .team7                                                            1
team6 .team8                                                                        1
team7 .team8                                                                                    1
team9 .team10                                                                                   1
team9 .team11                                                                       1
team9 .team12                                                           1
team9 .team13                                               1
team9 .team14                                   1
team9 .team15                       1
team9 .team16           1
team10.team11                                                           1
team10.team12                                                                       1
team10.team13                                   1
team10.team14                                               1
team10.team15           1
team10.team16                       1
team11.team12                                                                                   1
team11.team13                       1
team11.team14           1
team11.team15                                               1
team11.team16                                   1
team12.team13           1
team12.team14                       1
team12.team15                                   1
team12.team16                                               1
team13.team14                                                                                   1
team13.team15                                                                       1
team13.team16                                                           1
team14.team15                                                           1
team14.team16                                                                       1
team15.team16                                                                                   1
team17.team18                                                                                   1
team17.team19                                                                       1
team17.team20                                                           1
team17.team21                                               1
team17.team22                                   1
team17.team23                       1
team17.team24           1
team18.team19                                                           1
team18.team20                                                                       1
team18.team21                                   1
team18.team22                                               1
team18.team23           1
team18.team24                       1
team19.team20                                                                                   1
team19.team21                       1
team19.team22           1
team19.team23                                               1
team19.team24                                   1
team20.team21           1
team20.team22                       1
team20.team23                                   1
team20.team24                                               1
team21.team22                                                                                   1
team21.team23                                                                       1
team21.team24                                                           1
team22.team23                                                           1
team22.team24                                                                       1
team23.team24                                                                                   1


This model solved very quickly, it needed about 0.2 seconds. Of course, the solution is not unique.


GAMS representation


set
  t
'team' /team1*team24/
  r
'rounds' /round1*round23/
;

alias (t,t1,t2);

set lt(t1,t2) 'less than';
lt(t1,t2) =
ord(t1)<ord(t2);

binary variable game(t1,t2,r) 'schedule for teams';
variable dummy 'objective variable';

equations
   round(r,t1)
'play exactly one opponent each round'
   meet(t1,t2)
't1 meets t2 exactly once'
   obj        
'dummy objective'
;

round(r,t1)..
  
sum(lt(t1,t2),game(t1,t2,r))+sum(lt(t2,t1),game(t2,t1,r)) =e= 1;
meet(lt(t1,t2))..
sum(r,game(t1,t2,r)) =e= 1;
obj.. dummy=e=0;

model m /all/;
solve m minimizing dummy using mip;

option game:0;
display game.l;



Notice how the set lt(t1,t2) implements the strictly upper-triangular structure we want. We use the GAMS feature that if a variable does not appear in any of the equations, it will not be generated, even though we declared all game(t1,t2,r). Note that we don't need to set a gap using the optcr option. The solver will stop as soon as it has found a feasible integer solution: that solution is immediately optimal.

References


  1. Excel solver failure to get optimal solution when generating a schedule for 12 week schedule for 24 teams, https://stackoverflow.com/questions/62327762/excel-solver-failure-to-get-optimal-solution-when-generating-a-schedule-for-12-w

A Set Partitioning Problem and Infeasible Data Sets

In [1], a problem is stated that can be interpreted as a Set Partitioning Problem [2].

Consider the following data:


----     16 PARAMETER data  input data

            start      length        cost

job1            2           5          47
job2           19           5          20
job3           11           2          38
job4            5           2          14
job5            5           8          40
job6            3          10          26
job7            6           3          68
job8           19           5          61
job9                       10          80
job10          10           4          37
job11          22           2          70
job12          12           7          78
job13          22           2          67
job14          17           7          35
job15           1           4          17
job16          13           4          19
job17           1           8          68
job18           4           9          59
job19          14           8          12
job20           8           6          82

Note: zeros are not printed.

The problem is:

Find a subset of jobs that don't overlap and cover each hour 0 through 23, and that minimize cost.

 This is slightly different than the data in [1], in order to make the problem feasible.

Set Partitioning Model  


Before we can work on the model itself, we need to develop some derived data. We introduce the boolean parameter \(\mathit{Cover}_{i,t}\) defined by: \[\mathit{Cover}_{i,t}=\begin{cases} 1 & \text{if job $i$ covers hour $t$}\\ 0 & \text{otherwise}\end{cases}\] This parameter looks like:


----     16 PARAMETER cover  coverage of hours by jobs

              h00         h01         h02         h03         h04         h05         h06         h07         h08

job1                                    1           1           1           1           1
job4                                                                        1           1
job5                                                                        1           1           1           1
job6                                                1           1           1           1           1           1
job7                                                                                    1           1           1
job9            1           1           1           1           1           1           1           1           1
job15                       1           1           1           1
job17                       1           1           1           1           1           1           1           1
job18                                                           1           1           1           1           1
job20                                                                                                           1

    +         h09         h10         h11         h12         h13         h14         h15         h16         h17

job3                                    1           1
job5            1           1           1           1
job6            1           1           1           1
job9            1
job10                       1           1           1           1
job12                                               1           1           1           1           1           1
job14                                                                                                           1
job16                                                           1           1           1           1
job18           1           1           1           1
job19                                                                       1           1           1           1
job20           1           1           1           1           1

    +         h18         h19         h20         h21         h22         h23

job2                        1           1           1           1           1
job8                        1           1           1           1           1
job11                                                           1           1
job12           1
job13                                                           1           1
job14           1           1           1           1           1           1
job19           1           1           1           1


We further introduce the binary decision variables: \[x_i = \begin{cases} 1 & \text{if job $i$ is selected} \\ 0 & \text{otherwise}\end{cases}\] With this we can formulate:

Set Partitioning Model
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i \\ & \sum_i \color{darkblue}{\mathit{Cover}}_{i,t} \cdot \color{darkred}x_i = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]

When we solve this problem using the above data set we get the solution:


----     31 VARIABLE x.L  selected jobs

job9  1,    job10 1,    job13 1,    job19 1


----     31 VARIABLE tcost.L               =          196  total cost


It is noted that the model can also be stated using a set representation of the coverage data. Such a model can look like:


Set Partitioning Model 2
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i \\ & \sum_{i|\color{darkblue}{\mathit{Cover}}(i,t)} \color{darkred}x_i = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]

Although this looks somewhat different, it is really exactly the same. This is actually the form I use most of the time.

Infeasibilities


It is quite easy to see data sets that cause the problem to be infeasible. Indeed, the data set shown in [1] is actually yielding a model that is not feasible.

There are two ways in which the model can be infeasible. The first is the easy one: we have no jobs at all that cover a time period \(t\).  Consider the data:


----     16 PARAMETER data  input data

            start      length        cost

job1            2          12          63
job2           19           5          85
job3           11           2          31
job4            5           8          70
job5            5           2          80
job6            3           4          37
job7            6           9          20
job8           19           5          55
job9                        5          24
job10          10           5          89
job11          22           2          34
job12          12           2          36


----     16 PARAMETER cover  coverage of hours by jobs

              h00         h01         h02         h03         h04         h05         h06         h07         h08

job1                                    1           1           1           1           1           1           1
job4                                                                        1           1           1           1
job5                                                                        1           1
job6                                                1           1           1           1
job7                                                                                    1           1           1
job9            1           1           1           1           1

    +         h09         h10         h11         h12         h13         h14         h19         h20         h21

job1            1           1           1           1           1
job2                                                                                    1           1           1
job3                                    1           1
job4            1           1           1           1
job7            1           1           1           1           1           1
job8                                                                                    1           1           1
job10                       1           1           1           1           1
job12                                               1           1

    +         h22         h23

job2            1           1
job8            1           1
job11           1           1


We see that hours 15, 16, 17, and 18 are not covered by any job. This can be easily checked in advance. Even GAMS is complaining when generating the model:


**** Exec Error at line 24: Equation infeasible due to rhs value

**** INFEASIBLE EQUATIONS ...

---- ecover  =E=  

ecover(h15)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)
     
ecover(h16)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)
     
ecover(h17)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)
     
ecover(h18)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)


There is another form of infeasibility that is more difficult to diagnose. When we use the data set:


----     16 PARAMETER data  input data

            start      length        cost

job1            2           6          67
job2           19           5          52
job3           11           5          47
job4            5           2          20
job5            5           2          38
job6            3           8          14
job7            6          10          40
job8           19           3          26
job9                        8          68
job10          10          10          61
job11          22           2          80
job12          12           2          37
job13          22           2          70
job14          17           2          78
job15           1          11          67
job16          13           4          35
job17           1           4          17
job18           4           8          19
job19          14           9          68

we get:


               S O L V E      S U M M A R Y

     MODEL   m                   OBJECTIVE  tcost
     TYPE    MIP                 DIRECTION  MINIMIZE
     SOLVER  CPLEX               FROM LINE  28

**** SOLVER STATUS     1 Normal Completion         
**** MODEL STATUS      10 Integer Infeasible       
**** OBJECTIVE VALUE               NA


There is not much in diagnostics for most solvers. Cplex at least reports:


Infeasibility row 'ecover(h08)':  0  = 1.

This is better than most solvers, but may be difficult to digest for an end-user. In applications, it may be better to make sure the problem is never infeasible. Basically, allow the constraint to become infeasible but at a price. This is sometimes called an elastic formulation. There are two ways to do this for this model:

  1. use a data-driven method: introduce expensive emergency jobs used to make the constraint feasible
  2. change the constraint directly

Data-driven Elastic model 


We can add expensive filler-jobs to our data set as follows:


----     21 PARAMETER data  input data

               start      length        cost

job1               2           6          67
job2              19           5          52
job3              11           5          47
job4               5           2          20
job5               5           2          38
job6               3           8          14
job7               6          10          40
job8              19           3          26
job9                           8          68
job10             10          10          61
job11             22           2          80
job12             12           2          37
job13             22           2          70
job14             17           2          78
job15              1          11          67
job16             13           4          35
job17              1           4          17
job18              4           8          19
job19             14           9          68
notcov00                       1         999
notcov01           1           1         999
notcov02           2           1         999
notcov03           3           1         999
notcov04           4           1         999
notcov05           5           1         999
notcov06           6           1         999
notcov07           7           1         999
notcov08           8           1         999
notcov09           9           1         999
notcov10          10           1         999
notcov11          11           1         999
notcov12          12           1         999
notcov13          13           1         999
notcov14          14           1         999
notcov15          15           1         999
notcov16          16           1         999
notcov17          17           1         999
notcov18          18           1         999
notcov19          19           1         999
notcov20          20           1         999
notcov21          21           1         999
notcov22          22           1         999
notcov23          23           1         999


When we solve this, we see the following solution:


----     36 VARIABLE x.L  selected jobs

job12    1,    job15    1,    job19    1,    notcov00 1,    notcov23 1


----     36 VARIABLE tcost.L               =         2170  total cost


So we have trouble covering 2 hours with jobs. If we allow then to remain like that, the best solution is to use jobs 12, 15 and 19 for the other hours.


Elastic model: change constraint


Instead of changing the data, we can also do the same by changing the covering constraint:


Set Partitioning Model, elastic version
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i + \color{darkblue}{\mathit{Penalty}} \cdot \sum_t\color{darkred}{\mathit{Uncov}}_t \\ & \sum_i \color{darkblue}{\mathit{Cover}}_{i,t} \cdot \color{darkred}x_i + \color{darkred}{\mathit{Uncov}}_t = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred}{\mathit{Uncov}}_t \in [0,1] \end{align}\]

You can see we added a slack variable \({\mathit{Uncov}}_t\), and included this in the objective with a penalty (999). When we run this model, we see:


----     44 VARIABLE x.L  selected jobs

job12 1,    job15 1,    job19 1


----     44 VARIABLE uncov.L  uncovered hours

h00 1,    h23 1


----     44 VARIABLE tcost.L               =         2170  total cost (objective)
            PARAMETER jobcost              =          172  cost related to running jobs
            PARAMETER penalties            =         1998  for uncovered hours


I use this type of elastic formulations a lot. This approach prevents a user from seeing "sorry model was infeasible". Instead, it gives back a meaningful result.

References


  1. Profit maximising job scheduling using Python, https://stackoverflow.com/questions/62297792/profit-maximising-job-scheduling-using-python
  2. Garfinkel, R. S., and G. L. Nemhauser. “The Set-Partitioning Problem: Set Covering with Equality Constraints.” Operations Research, vol. 17, no. 5, 1969, pp. 848–856.