## Friday, June 12, 2020

### A Set Partitioning Problem and Infeasible Data Sets

In [1], a problem is stated that can be interpreted as a Set Partitioning Problem [2].

Consider the following data:

----     16 PARAMETER data  input data

start      length        cost

job1            2           5          47
job2           19           5          20
job3           11           2          38
job4            5           2          14
job5            5           8          40
job6            3          10          26
job7            6           3          68
job8           19           5          61
job9                       10          80
job10          10           4          37
job11          22           2          70
job12          12           7          78
job13          22           2          67
job14          17           7          35
job15           1           4          17
job16          13           4          19
job17           1           8          68
job18           4           9          59
job19          14           8          12
job20           8           6          82


Note: zeros are not printed.

The problem is:

Find a subset of jobs that don't overlap and cover each hour 0 through 23, and that minimize cost.

This is slightly different than the data in [1], in order to make the problem feasible.

#### Set Partitioning Model

Before we can work on the model itself, we need to develop some derived data. We introduce the boolean parameter $$\mathit{Cover}_{i,t}$$ defined by: $\mathit{Cover}_{i,t}=\begin{cases} 1 & \text{if job i covers hour t}\\ 0 & \text{otherwise}\end{cases}$ This parameter looks like:

----     16 PARAMETER cover  coverage of hours by jobs

h00         h01         h02         h03         h04         h05         h06         h07         h08

job1                                    1           1           1           1           1
job4                                                                        1           1
job5                                                                        1           1           1           1
job6                                                1           1           1           1           1           1
job7                                                                                    1           1           1
job9            1           1           1           1           1           1           1           1           1
job15                       1           1           1           1
job17                       1           1           1           1           1           1           1           1
job18                                                           1           1           1           1           1
job20                                                                                                           1

+         h09         h10         h11         h12         h13         h14         h15         h16         h17

job3                                    1           1
job5            1           1           1           1
job6            1           1           1           1
job9            1
job10                       1           1           1           1
job12                                               1           1           1           1           1           1
job14                                                                                                           1
job16                                                           1           1           1           1
job18           1           1           1           1
job19                                                                       1           1           1           1
job20           1           1           1           1           1

+         h18         h19         h20         h21         h22         h23

job2                        1           1           1           1           1
job8                        1           1           1           1           1
job11                                                           1           1
job12           1
job13                                                           1           1
job14           1           1           1           1           1           1
job19           1           1           1           1


We further introduce the binary decision variables: $x_i = \begin{cases} 1 & \text{if job i is selected} \\ 0 & \text{otherwise}\end{cases}$ With this we can formulate:

Set Partitioning Model
\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i \\ & \sum_i \color{darkblue}{\mathit{Cover}}_{i,t} \cdot \color{darkred}x_i = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \end{align}

When we solve this problem using the above data set we get the solution:

----     31 VARIABLE x.L  selected jobs

job9  1,    job10 1,    job13 1,    job19 1

----     31 VARIABLE tcost.L               =          196  total cost


It is noted that the model can also be stated using a set representation of the coverage data. Such a model can look like:

Set Partitioning Model 2
\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i \\ & \sum_{i|\color{darkblue}{\mathit{Cover}}(i,t)} \color{darkred}x_i = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \end{align}

Although this looks somewhat different, it is really exactly the same. This is actually the form I use most of the time.

#### Infeasibilities

It is quite easy to see data sets that cause the problem to be infeasible. Indeed, the data set shown in [1] is actually yielding a model that is not feasible.

There are two ways in which the model can be infeasible. The first is the easy one: we have no jobs at all that cover a time period $$t$$.  Consider the data:

----     16 PARAMETER data  input data

start      length        cost

job1            2          12          63
job2           19           5          85
job3           11           2          31
job4            5           8          70
job5            5           2          80
job6            3           4          37
job7            6           9          20
job8           19           5          55
job9                        5          24
job10          10           5          89
job11          22           2          34
job12          12           2          36

----     16 PARAMETER cover  coverage of hours by jobs

h00         h01         h02         h03         h04         h05         h06         h07         h08

job1                                    1           1           1           1           1           1           1
job4                                                                        1           1           1           1
job5                                                                        1           1
job6                                                1           1           1           1
job7                                                                                    1           1           1
job9            1           1           1           1           1

+         h09         h10         h11         h12         h13         h14         h19         h20         h21

job1            1           1           1           1           1
job2                                                                                    1           1           1
job3                                    1           1
job4            1           1           1           1
job7            1           1           1           1           1           1
job8                                                                                    1           1           1
job10                       1           1           1           1           1
job12                                               1           1

+         h22         h23

job2            1           1
job8            1           1
job11           1           1


We see that hours 15, 16, 17, and 18 are not covered by any job. This can be easily checked in advance. Even GAMS is complaining when generating the model:

**** Exec Error at line 24: Equation infeasible due to rhs value

**** INFEASIBLE EQUATIONS ...

---- ecover  =E=

ecover(h15)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)

ecover(h16)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)

ecover(h17)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)

ecover(h18)..  0 =E= 1 ; (LHS = 0, INFES = 1 ****)


There is another form of infeasibility that is more difficult to diagnose. When we use the data set:

----     16 PARAMETER data  input data

start      length        cost

job1            2           6          67
job2           19           5          52
job3           11           5          47
job4            5           2          20
job5            5           2          38
job6            3           8          14
job7            6          10          40
job8           19           3          26
job9                        8          68
job10          10          10          61
job11          22           2          80
job12          12           2          37
job13          22           2          70
job14          17           2          78
job15           1          11          67
job16          13           4          35
job17           1           4          17
job18           4           8          19
job19          14           9          68


we get:

               S O L V E      S U M M A R Y

MODEL   m                   OBJECTIVE  tcost
TYPE    MIP                 DIRECTION  MINIMIZE
SOLVER  CPLEX               FROM LINE  28

**** SOLVER STATUS     1 Normal Completion
**** MODEL STATUS      10 Integer Infeasible
**** OBJECTIVE VALUE               NA


There is not much in diagnostics for most solvers. Cplex at least reports:

Infeasibility row 'ecover(h08)':  0  = 1.


This is better than most solvers, but may be difficult to digest for an end-user. In applications, it may be better to make sure the problem is never infeasible. Basically, allow the constraint to become infeasible but at a price. This is sometimes called an elastic formulation. There are two ways to do this for this model:

1. use a data-driven method: introduce expensive emergency jobs used to make the constraint feasible
2. change the constraint directly

#### Data-driven Elastic model

We can add expensive filler-jobs to our data set as follows:

----     21 PARAMETER data  input data

start      length        cost

job1               2           6          67
job2              19           5          52
job3              11           5          47
job4               5           2          20
job5               5           2          38
job6               3           8          14
job7               6          10          40
job8              19           3          26
job9                           8          68
job10             10          10          61
job11             22           2          80
job12             12           2          37
job13             22           2          70
job14             17           2          78
job15              1          11          67
job16             13           4          35
job17              1           4          17
job18              4           8          19
job19             14           9          68
notcov00                       1         999
notcov01           1           1         999
notcov02           2           1         999
notcov03           3           1         999
notcov04           4           1         999
notcov05           5           1         999
notcov06           6           1         999
notcov07           7           1         999
notcov08           8           1         999
notcov09           9           1         999
notcov10          10           1         999
notcov11          11           1         999
notcov12          12           1         999
notcov13          13           1         999
notcov14          14           1         999
notcov15          15           1         999
notcov16          16           1         999
notcov17          17           1         999
notcov18          18           1         999
notcov19          19           1         999
notcov20          20           1         999
notcov21          21           1         999
notcov22          22           1         999
notcov23          23           1         999


When we solve this, we see the following solution:

----     36 VARIABLE x.L  selected jobs

job12    1,    job15    1,    job19    1,    notcov00 1,    notcov23 1

----     36 VARIABLE tcost.L               =         2170  total cost


So we have trouble covering 2 hours with jobs. If we allow then to remain like that, the best solution is to use jobs 12, 15 and 19 for the other hours.

#### Elastic model: change constraint

Instead of changing the data, we can also do the same by changing the covering constraint:

Set Partitioning Model, elastic version
\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot \color{darkred}x_i + \color{darkblue}{\mathit{Penalty}} \cdot \sum_t\color{darkred}{\mathit{Uncov}}_t \\ & \sum_i \color{darkblue}{\mathit{Cover}}_{i,t} \cdot \color{darkred}x_i + \color{darkred}{\mathit{Uncov}}_t = 1 &&\forall t \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred}{\mathit{Uncov}}_t \in [0,1] \end{align}

You can see we added a slack variable $${\mathit{Uncov}}_t$$, and included this in the objective with a penalty (999). When we run this model, we see:

----     44 VARIABLE x.L  selected jobs

job12 1,    job15 1,    job19 1

----     44 VARIABLE uncov.L  uncovered hours

h00 1,    h23 1

----     44 VARIABLE tcost.L               =         2170  total cost (objective)
PARAMETER jobcost              =          172  cost related to running jobs
PARAMETER penalties            =         1998  for uncovered hours


I use this type of elastic formulations a lot. This approach prevents a user from seeing "sorry model was infeasible". Instead, it gives back a meaningful result.

#### References

1. Profit maximising job scheduling using Python, https://stackoverflow.com/questions/62297792/profit-maximising-job-scheduling-using-python
2. Garfinkel, R. S., and G. L. Nemhauser. “The Set-Partitioning Problem: Set Covering with Equality Constraints.” Operations Research, vol. 17, no. 5, 1969, pp. 848–856.