Problem Statement
The problem we are trying to solve here is a simple scheduling model [1]. But as we shall see, it has some interesting performance issues.
There are \(N\) tasks (jobs) and \(M\) rooms. We need to assign jobs to rooms where they are executed. We assume we execute one job at the time in each room (but jobs in different rooms can execute concurrently). Jobs require some resources (for example water tap, gas piping). Rooms provide certain resources. We need to make sure tasks are assigned to rooms that contain the resources that are needed. Finally, all jobs have a due date.
Data
Let's generate some data.
---- 33 SET use resource usage resource1 resource2 resource3 resource4 resource5 task2 YES task3 YES task5 YES task7 YES task9 YES YES task11 YES task12 YES YES task13 YES task14 YES task15 YES task16 YES YES task17 YES task20 YES YES task21 YES YES task23 YES task24 YES task25 YES YES task26 YES task28 YES ---- 33 SET avail resource availability resource1 resource2 resource3 resource4 resource5 room1 YES YES YES YES room2 YES YES room3 YES YES room4 YES YES YES YES room5 YES YES YES YES ---- 33 PARAMETER length job length task1 2.335, task2 4.935, task3 4.066, task4 1.440, task5 4.979, task6 3.321, task7 1.666 task8 3.573, task9 2.377, task10 4.649, task11 4.600, task12 1.065, task13 2.475, task14 3.658 task15 3.374, task16 1.138, task17 4.367, task18 4.728, task19 3.032, task20 2.198, task21 2.986 task22 1.180, task23 4.095, task24 3.132, task25 3.987, task26 3.880, task27 3.526, task28 1.460 task29 4.885, task30 3.827 ---- 33 PARAMETER due job due dates task1 5.166, task2 5.333, task3 5.493, task4 5.540, task5 6.226, task6 8.105 task7 8.271, task8 8.556, task9 8.677, task10 8.922, task11 10.184, task12 11.711 task13 11.975, task14 12.814, task15 12.867, task16 14.023, task17 14.200, task18 15.820 task19 15.877, task20 16.156, task21 16.438, task22 16.885, task23 17.033, task24 17.813 task25 21.109, task26 21.713, task27 23.655, task28 23.977, task29 24.014, task30 24.507
We have 30 tasks, 5 rooms, and 5 resources. Note that some tasks don't need special resources (e.g. task1). They can execute in any room. Some jobs require resources that allow only one room. For instance, task9 needs resources 2 and 4. Only room1 provides this combination.
In the model, we actually don't need to know about resource usage. The only thing we need to know is whether job \(i\) can be assigned to room \(j\). So I calculated a set Allowed:
---- 37 SET allowed task is allowed to be executed in room
room1 room2 room3 room4 room5
task1 YES YES YES YES YES
task2 YES YES YES YES
task3 YES YES YES YES
task4 YES YES YES YES YES
task5 YES YES YES YES
task6 YES YES YES YES YES
task7 YES YES
task8 YES YES YES YES YES
task9 YES
task10 YES YES YES YES YES
task11 YES YES YES YES
task12 YES
task13 YES YES
task14 YES YES
task15 YES YES YES YES
task16 YES YES YES
task17 YES YES
task18 YES YES YES YES YES
task19 YES YES YES YES YES
task20 YES
task21 YES
task22 YES YES YES YES YES
task23 YES YES
task24 YES YES YES YES
task25 YES YES
task26 YES YES YES YES
task27 YES YES YES YES YES
task28 YES YES YES YES
task29 YES YES YES YES YES
task30 YES YES YES YES YES
Model 1
My first approach is to use no-overlap constraints for jobs that are assigned to the same room.
Mixed Integer Programming Model 1 |
---|
\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Finish}}_{i} = \color{darkred}{\mathit{Start}}_{i} + \color{darkblue}{\mathit{Length}}_{i} && \forall i \\ &\color{darkred}{\mathit{Start}}_{i} \ge \color{darkred}{\mathit{Finish}}_{i'} - \color{darkblue}M \cdot\color{darkred}\delta_{i,i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) && \forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)\\ &\color{darkred}{\mathit{Start}}_{i'} \ge \color{darkred}{\mathit{Finish}}_{i} - \color{darkblue}M (1-\color{darkred}\delta_{i,i',j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j}) - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i',j}) &&\forall i \lt i',j | \color{darkblue}{\mathit{Allowed}}(i,j) \>\mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j) \\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Start}}_{i},\color{darkred}{\mathit{Finish}}_{i} \ge 0\\ & \color{darkred}\delta_{i,i',j} \in \{0,1\} \end{align} \] |
The no-overlap constraints say, if jobs \(i\) and \(i'\) execute in the same room \(j\) then either \(i\) has to execute before \(i'\) or \(i'\) has to execute before \(i\). As usual, the big-M values are used to make constraints non-binding when they are not to be obeyed. For this problem, I simply used \(M=100\).
This model does not perform very well at all. After an hour, I still saw a gap of 27% In addition, the number of constraints is large (given the small data set): 2,352.
We can improve on this formulation by observing that we don't need all variables \(\delta_{i,i',j}\). Instead we can use \(\delta_{i,i'}\). This improves the performance a bit, but it is still not very good. This version is called "Improved Model 1" in the results table further down.
Model 2
Let's try a different model. First, we make sure all jobs are ordered by the due date. This means that we can find the finish time for job \(i\) placed in room \(j\) by calculating the processing time of all previous jobs assigned to room \(j\): \[\sum_{i'|i'\le i\>\mathbf{and} \mathit{Allowed}(i',j)} {\mathit{Length}}_{i'} \cdot {\mathit{Assign}}_{i',j} \] This means: we execute jobs assigned to a room back-to-back (no holes). Using this approach we can write:
Mixed Integer Programming Model 2 |
---|
\[ \begin{align} \min\>&\color{darkred}{\mathit{Makespan}}\\ &\color{darkred}{\mathit{Finish}}_{i} \ge \sum_{i'|i'\le i\> \mathbf{and}\>\color{darkblue}{\mathit{Allowed}}(i',j)} \color{darkblue}{\mathit{Length}}_{i'} \cdot \color{darkred}{\mathit{Assign}}_{i',j} - \color{darkblue}M (1-\color{darkred}{\mathit{Assign}}_{i,j})&& \forall i,j|\color{darkblue}{\mathit{Allowed}}(i,j)\\ &\color{darkred}{\mathit{Finish}}_i \le \color{darkblue}{\mathit{DueDate}}_{i} && \forall i\\ &\sum_{j|\color{darkblue}{\mathit{Allowed}}(i,j)} \color{darkred}{\mathit{Assign}}_{i,j} = 1 && \forall i \\ &\color{darkred}{\mathit{Makespan}} \ge \color{darkred}{\mathit{Finish}}_{i} && \forall i \\ & \color{darkred}{\mathit{Assign}}_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{Finish}}_{i} \ge 0\\ \end{align} \] |
Again we rely heavily on the set Allowed.
Note that in the finish constraint we repeat large parts of the summation in subsequent constraints. For large models, we may want to look into this (e.g. by adding extra variables and constraints to reduce the number of nonzero elements). In this case, I just left the constraint as specified in the mathematical model.
This model 2 turns out to be much faster:
Model 1 | Improved Model 1 | Model 2 | |
---|---|---|---|
Rows | 2,352 | 2,352 | 286 |
Columns | 1,283 | 585 | 137 |
Binary Columns | 1,222 | 524 | 106 |
Time | >3,600 | >3,600 | 24 |
Status | Time Limit | Time Limit | Optimal |
Objective | 19.5142 | 19.4359 | 19.4322 |
Gap | 27% | 22% | 0% |
The difference between models 1 and 2 is rather dramatic.
Solution
---- 129 PARAMETER results start length finish duedate room1.task1 2.335 2.335 5.166 room1.task9 2.335 2.377 4.712 8.677 room1.task11 4.712 4.600 9.312 10.184 room1.task20 9.312 2.198 11.510 16.156 room1.task23 11.510 4.095 15.605 17.033 room1.task30 15.605 3.827 19.432 24.507 room2.task6 3.321 3.321 8.105 room2.task10 3.321 4.649 7.971 8.922 room2.task15 7.971 3.374 11.344 12.867 room2.task24 11.344 3.132 14.476 17.813 room2.task29 14.476 4.885 19.361 24.014 room3.task2 4.935 4.935 5.333 room3.task8 4.935 3.573 8.508 8.556 room3.task18 8.508 4.728 13.237 15.820 room3.task22 13.237 1.180 14.416 16.885 room3.task27 14.416 3.526 17.943 23.655 room3.task28 17.943 1.460 19.403 23.977 room4.task3 4.066 4.066 5.493 room4.task4 4.066 1.440 5.506 5.540 room4.task13 5.506 2.475 7.981 11.975 room4.task17 7.981 4.367 12.348 14.200 room4.task21 12.348 2.986 15.335 16.438 room4.task25 15.335 3.987 19.322 21.109 room5.task5 4.979 4.979 6.226 room5.task7 4.979 1.666 6.645 8.271 room5.task12 6.645 1.065 7.710 11.711 room5.task14 7.710 3.658 11.367 12.814 room5.task16 11.367 1.138 12.506 14.023 room5.task19 12.506 3.032 15.538 15.877 room5.task26 15.538 3.880 19.418 21.713
The highlighted completion time corresponds to the makespan.
Conclusion
This is a bit more interesting model than I initially thought.
The standard no-overlap constraints for a continuous-time model do not perform very well. In addition, they are a bit complicated due to several big-M terms (the constraints should only hold in certain specific cases: two jobs run in the same room).
By using the assumption that jobs in a single room are executed in order of the due date (not a bad assumption), we can create a much simpler MIP model that also performs much, much better.
When we have an infeasible solution (i.e. we cannot meet all due dates), we may want to deliver a good schedule that minimizes the damage. This is easy is we just add a slack variable to each due date constraint, and add the slack with a cost (penalty) to the objective. This essentially minimized the sum of the due date violations. There may be a reason to also look at minimizing the number of tardy jobs. It is noted that because we fixed the order of jobs in a single room, we may not get the best if we want to minimize the number of tardy jobs.
References
- Allocating and scheduling tasks into rooms with conditions - optimization algorithm, https://stackoverflow.com/questions/61656492/allocating-and-scheduling-tasks-into-rooms-with-conditions-optimization-algori
No comments:
Post a Comment