## Friday, May 8, 2020

### A scheduling problem

#### Problem Statement

The problem we are trying to solve here is a simple scheduling model [1]. But as we shall see, it has some interesting performance issues.

There are $$N$$ tasks (jobs) and $$M$$ rooms. We need to assign jobs to rooms where they are executed. We assume we execute one job at the time in each room (but jobs in different rooms can execute concurrently). Jobs require some resources (for example water tap, gas piping). Rooms provide certain resources. We need to make sure tasks are assigned to rooms that contain the resources that are needed. Finally, all jobs have a due date.

#### Data

Let's generate some data.

----     33 SET use  resource usage

resource1   resource2   resource3   resource4   resource5

task2                                              YES
task3                                                          YES
task5                                  YES
task7          YES
task9                      YES                     YES
task11                                                         YES
task12         YES                                 YES
task13         YES
task14         YES
task15                                 YES
task16                                 YES                     YES
task17                     YES
task20                     YES                     YES
task21         YES         YES
task23                     YES
task24                                             YES
task25         YES                                             YES
task26                                 YES
task28                                                         YES

----     33 SET avail  resource availability

resource1   resource2   resource3   resource4   resource5

room1                     YES         YES         YES         YES
room2                                 YES         YES
room3                                             YES         YES
room4         YES         YES         YES                     YES
room5         YES                     YES         YES         YES

----     33 PARAMETER length  job length

task1  2.335,    task2  4.935,    task3  4.066,    task4  1.440,    task5  4.979,    task6  3.321,    task7  1.666
task8  3.573,    task9  2.377,    task10 4.649,    task11 4.600,    task12 1.065,    task13 2.475,    task14 3.658
task15 3.374,    task16 1.138,    task17 4.367,    task18 4.728,    task19 3.032,    task20 2.198,    task21 2.986
task22 1.180,    task23 4.095,    task24 3.132,    task25 3.987,    task26 3.880,    task27 3.526,    task28 1.460
task29 4.885,    task30 3.827

----     33 PARAMETER due  job due dates

task1   5.166,    task2   5.333,    task3   5.493,    task4   5.540,    task5   6.226,    task6   8.105
task7   8.271,    task8   8.556,    task9   8.677,    task10  8.922,    task11 10.184,    task12 11.711
task13 11.975,    task14 12.814,    task15 12.867,    task16 14.023,    task17 14.200,    task18 15.820
task19 15.877,    task20 16.156,    task21 16.438,    task22 16.885,    task23 17.033,    task24 17.813
task25 21.109,    task26 21.713,    task27 23.655,    task28 23.977,    task29 24.014,    task30 24.507


We have 30 tasks, 5 rooms, and 5 resources. Note that some tasks don't need special resources (e.g. task1). They can execute in any room. Some jobs require resources that allow only one room. For instance, task9 needs resources 2 and 4. Only room1 provides this combination.

In the model, we actually don't need to know about resource usage. The only thing we need to know is whether job $$i$$ can be assigned to room $$j$$. So I calculated a set Allowed:

----     37 SET allowed  task is allowed to be executed in room

room1       room2       room3       room4       room5

task1          YES         YES         YES         YES         YES
task2          YES         YES         YES                     YES
task3          YES                     YES         YES         YES
task4          YES         YES         YES         YES         YES
task5          YES         YES                     YES         YES
task6          YES         YES         YES         YES         YES
task7                                              YES         YES
task8          YES         YES         YES         YES         YES
task9          YES
task10         YES         YES         YES         YES         YES
task11         YES                     YES         YES         YES
task12                                                         YES
task13                                             YES         YES
task14                                             YES         YES
task15         YES         YES                     YES         YES
task16         YES                                 YES         YES
task17         YES                                 YES
task18         YES         YES         YES         YES         YES
task19         YES         YES         YES         YES         YES
task20         YES
task21                                             YES
task22         YES         YES         YES         YES         YES
task23         YES                                 YES
task24         YES         YES         YES                     YES
task25                                             YES         YES
task26         YES         YES                     YES         YES
task27         YES         YES         YES         YES         YES
task28         YES                     YES         YES         YES
task29         YES         YES         YES         YES         YES
task30         YES         YES         YES         YES         YES


#### Model 1

My first approach is to use no-overlap constraints for jobs that are assigned to the same room.

Mixed Integer Programming Model 1
\begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Finish}}_{i} = \color{darkred}{\mathit{Start}}_{i} + \color{darkblue}{\mathit{Length}}_{i} && \forall i \\ &\color{darkred}{\mathit{Start}}_{i} \ge \color{darkred}{\mathit{Finish}}_{i'} - \color{darkblue}M \cdot\color{darkred}\delta_{i,i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) && \forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)\\ &\color{darkred}{\mathit{Start}}_{i'} \ge \color{darkred}{\mathit{Finish}}_{i} - \color{darkblue}M (1-\color{darkred}\delta_{i,i',j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) &&\forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j) \\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Start}}_{i},\color{darkred}{\mathit{Finish}}_{i} \ge 0\\ & \color{darkred}\delta_{i,i',j} \in \{0,1\} \end{align}

The no-overlap constraints say, if jobs $$i$$ and $$i'$$ execute in the same room $$j$$ then either $$i$$ has to execute before $$i'$$ or $$i'$$ has to execute before $$i$$. As usual, the big-M values are used to make constraints non-binding when they are not to be obeyed. For this problem, I simply used $$M=100$$.

This model does not perform very well at all. After an hour, I still saw a gap of 27% In addition, the number of constraints is large (given the small data set): 2,352.

We can improve on this formulation by observing that we don't need all variables $$\delta_{i,i',j}$$. Instead we can use  $$\delta_{i,i'}$$. This improves the performance a bit, but it is still not very good. This version is called "Improved Model 1" in the results table further down.

#### Model 2

Let's try a different model. First, we make sure all jobs are ordered by the due date. This means that we can find the finish time for job $$i$$ placed in room $$j$$ by calculating the processing time of all previous jobs assigned to room $$j$$: $\sum_{i'|i'\le i\>\mathbf{and} \mathit{Allowed}(i',j)} {\mathit{Length}}_{i'} \cdot {\mathit{Assign}}_{i',j}$ This means: we execute jobs assigned to a room back-to-back (no holes). Using this approach we can write:

Mixed Integer Programming Model 2
\begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\color{darkred}{\mathit{Finish}}_{i} \ge \sum_{i'|i'\le i\> \mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)} \color{darkblue}{\mathit{Length}}_{i'} \cdot \color{darkred}{\mathit{Assign}}_{i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j})&& \forall i,j|\color{darkblue}{\mathit{Allowed}}(i,j)\\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Finish}}_{i} \ge 0\\ \end{align}

Again we rely heavily on the set Allowed.

Note that in the finish constraint we repeat large parts of the summation in subsequent constraints. For large models, we may want to look into this (e.g. by adding extra variables and constraints to reduce the number of nonzero elements). In this case, I just left the constraint as specified in the mathematical model.

This model 2 turns out to be much faster:

Model 1Improved Model 1Model 2
Rows2,3522,352286
Columns1,283585137
Binary Columns1,222524106
Time>3,600>3,60024
StatusTime LimitTime LimitOptimal
Objective19.514219.435919.4322
Gap27%22%0%

The difference between models 1 and 2 is rather dramatic.

#### Solution

The solution values for the variables $$\mathit{Finish}_i$$ are not necessarily as small as possible. The objective does not push all job completion times down, only the ones involved with the total makespan (i.e. on the critical path). When reporting it makes sense just to take the optimal assignments from the model and then execute jobs as early as possible. This is what I did here. Jobs on a single line are ordered by the due date. The recalculated solution is:

----    129 PARAMETER results

start      length      finish     duedate

room1.task1                    2.335       2.335       5.166
room1.task9        2.335       2.377       4.712       8.677
room1.task11       4.712       4.600       9.312      10.184
room1.task20       9.312       2.198      11.510      16.156
room1.task23      11.510       4.095      15.605      17.033
room1.task30      15.605       3.827      19.432      24.507
room2.task6                    3.321       3.321       8.105
room2.task10       3.321       4.649       7.971       8.922
room2.task15       7.971       3.374      11.344      12.867
room2.task24      11.344       3.132      14.476      17.813
room2.task29      14.476       4.885      19.361      24.014
room3.task2                    4.935       4.935       5.333
room3.task8        4.935       3.573       8.508       8.556
room3.task18       8.508       4.728      13.237      15.820
room3.task22      13.237       1.180      14.416      16.885
room3.task27      14.416       3.526      17.943      23.655
room3.task28      17.943       1.460      19.403      23.977
room4.task3                    4.066       4.066       5.493
room4.task4        4.066       1.440       5.506       5.540
room4.task13       5.506       2.475       7.981      11.975
room4.task17       7.981       4.367      12.348      14.200
room4.task21      12.348       2.986      15.335      16.438
room4.task25      15.335       3.987      19.322      21.109
room5.task5                    4.979       4.979       6.226
room5.task7        4.979       1.666       6.645       8.271
room5.task12       6.645       1.065       7.710      11.711
room5.task14       7.710       3.658      11.367      12.814
room5.task16      11.367       1.138      12.506      14.023
room5.task19      12.506       3.032      15.538      15.877
room5.task26      15.538       3.880      19.418      21.713


The highlighted completion time corresponds to the makespan.

#### Conclusion

This is a bit more interesting model than I initially thought.

The standard no-overlap constraints for a continuous-time model do not perform very well. In addition, they are a bit complicated due to several big-M terms (the constraints should only hold in certain specific cases: two jobs run in the same room).

By using the assumption that jobs in a single room are executed in order of the due date (not a bad assumption), we can create a much simpler MIP model that also performs much, much better.

When we have an infeasible solution (i.e. we cannot meet all due dates), we may want to deliver a good schedule that minimizes the damage. This is easy is we just add a slack variable to each due date constraint, and add the slack with a cost (penalty) to the objective. This essentially minimized the sum of the due date violations. There may be a reason to also look at minimizing the number of tardy jobs. It is noted that because we fixed the order of jobs in a single room, we may not get the best if we want to minimize the number of tardy jobs.

#### References

1. Allocating and scheduling tasks into rooms with conditions - optimization algorithm, https://stackoverflow.com/questions/61656492/allocating-and-scheduling-tasks-into-rooms-with-conditions-optimization-algori