In [1] two formulations were discussed for a scheduling problem with multiple machines. Here we add a few more. Some of them are a bit strange. None of them really works better than the ones in [1].
So this is a collection of formulations that may sound reasonable at first, but are really not suited for this problem. If you want to read about good models, skip this post.
Summary of the input data
---- 29 PARAMETER jobdata proctime release duedate weights job1 4 61 70 1 job2 9 59 78 1 job3 7 67 1 job4 5 21 35 1 job5 5 6 17 1 job6 4 19 1 job7 5 57 68 1 job8 9 41 59 1 job9 3 12 1 job10 7 67 82 1 job11 10 16 1 job12 7 9 25 1 job13 10 40 57 1 job14 9 58 78 1 job15 4 26 1 job16 8 13 1 job17 4 63 1 job18 5 66 1 job19 8 45 1 job20 6 25 42 1 job21 5 32 1 job22 5 32 1 job23 4 5 21 1 job24 4 94 1 job25 7 22 44 1 job26 9 60 81 1 job27 4 37 1 job28 8 21 1 job29 9 61 78 1 job30 5 4 16 1 job31 3 28 1 job32 7 10 1 job33 4 24 34 1 job34 9 39 55 1 job35 5 23 1 job36 5 25 1 job37 7 24 40 1 job38 8 38 1 job39 8 39 1 job40 6 97 1 job41 6 100 1 job42 3 33 43 1 job43 5 28 43 1 job44 3 64 80 1 job45 5 31 46 1 job46 4 93 1 job47 8 2 20 1 job48 7 59 76 1 job49 9 15 1 job50 5 51 62 1 ---- 33 SET precedence precedence constraints: i must execute before j job4 job8 job9 job10 job11 job13 job16 job20 job33 job1 YES job4 YES job6 YES job7 YES job8 YES YES YES job11 YES job15 YES job18 YES job28 YES job30 YES + job36 job37 job44 job33 YES YES job40 YES
- A time-indexed model that used the start of a job as central variable: \[\color{darkred}x_{j,m,t} = \begin{cases} 1 & \text{if job $j$ starts at time $t$ on machine $m$}\\ 0 & \text{otherwise}\end{cases}\] This model did very well.
- A continuous-time model, where we model no-overlap using standard big-M constraints based on the Alan Manne formulation to sequence jobs such that they don't overlap [2]. This model did not do as well. It found good solutions fairly quickly but finding and proving the global optimum was challenging.
An optimal solution |
- M3: A version of a time-indexed model that can be used when the run length of a job is not known in advance. That more general approach leads to a slower model.
- M4: A version of a continuous-time model from [3]. It is more complicated but not better than the model I used in [1].
- M5: A positional formulation. It works, but it is not competitive.
M3: A slow time-indexed model
Relation between x, xstart and xend. |
It is noted that \(\color{darkred}x_{j,m,t}\) is a real binary variable. The variables \(\color{darkred}{\mathit{xstart}}_{j,m,t}\) and \(\color{darkred}{\mathit{xend}}_{j,m,t}\) can be continuous between 0 and 1.
Model M3: Slow time indexed model |
---|
\[\begin{align} \min&\sum_{\mathit{objs}} \color{darkblue}{\mathit{objw}}_{\mathit{objs}} \cdot \color{darkred}{\mathit{obj}}_{\mathit{objs}} && \\ \hline & \color{darkred}{\mathit{obj}}_{\mathit{completion}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{completion}}_j && \\ & \color{darkred}{\mathit{obj}}_{\mathit{sumtardy}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{tardy}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{numtardy}} = \sum_{j,m,t|t\ge\color{darkblue}{\mathit{due}}(j)}\color{darkred}{\mathit{xend}}_{j,m,t} \\ & \color{darkred}{\mathit{obj}}_{\mathit{maxtardy}} \ge \color{darkred}{\mathit{tardy}}_j && \forall j \\ & \color{darkred}{\mathit{obj}}_{\mathit{makespan}} \ge \color{darkred}{\mathit{completion}}_j && \forall j\\ \hline& \color{darkred}{\mathit{xstart}}_{j,m,t}\ge\color{darkred}x_{j,m,t} - \color{darkred}x_{j,m,t-1} \\ & \color{darkred}{\mathit{xend}}_{j,m,t}\ge\color{darkred}x_{j,m,t} - \color{darkred}x_{j,m,t+1} \\ & \sum_{m,t} \color{darkred}{\mathit{xstart}}_{j,m,t} \le 1&&\forall j \\ & \sum_{m,t} \color{darkred}{\mathit{xend}}_{j,m,t} \le 1&&\forall j \\ &\sum_{m,t}\color{darkred}x_{j,m,t} = \color{darkblue}{\mathit{proctime}}_j && \forall j \\ &\sum_j \color{darkred}x_{j,m,t} \le 1 && \forall m,t\\ & \color{darkred}{\mathit{start}}_j = \sum_{m,t} t \cdot \color{darkred}{\mathit{xstart}}_{j,m,t} && \forall j \\ & \color{darkred}{\mathit{completion}}_j = \color{darkred}{\mathit{start}}_j +\color{darkblue}{\mathit{proctime}}_j && \forall j\\ &\color{darkred}{\mathit{tardy}}_j \ge\color{darkred}{\mathit{completion}}_j -\color{darkblue}{\mathit{due}}_j&& \forall j\\ &\color{darkred}{\mathit{completion}}_i \le\color{darkred}{\mathit{start}}_j && \forall i,j|\color{darkblue}{\mathit{precedence}}_{i,j} && \\ \hline & \color{darkred}x_{j,m,t} \in \{0,1\} && \\ &\color{darkred}{\mathit{xstart}}_{j,m,t}, \color{darkred}{\mathit{xend}}_{j,m,t} \in [0,1]\\ & \color{darkred}{\mathit{obj}}_{\mathit{objs}} \ge 0 \\ & \color{darkred}{\mathit{tardy}}_j \ge 0 \\ &\color{darkred}{\mathit{start}}_j \ge \color{darkblue}{\mathit{release}}_j \end{align}\] |
model m1 time-indexed | model m2 continuous-time | model m3 time-indexed | |
---|---|---|---|
Equations | 626 | 10,026 | 40,726 |
Variables | 20,158 | 1,633 | 60,158 |
binary | 20,000 | 1,475 | 20,000 |
Nonzero elements | 191,024 | 49,896 | 250,752 |
Objective | 324.096 | 324.102 | 437.303 |
completion | 2,096 | 2,102 | 2,303 |
sumtardy | 322 | 322 | 435 |
maxtardy | 84 | 84 | 87 |
numtardy | 7 | 7 | 14 |
makespan | 97 | 97 | 100 |
Time (seconds) | 20 | 1800 (time limit) | 1800 (time limit) |
Gap | optimal | 0.13% | 30% |
Nodes | 529 | 1,262,868 | 5,317 |
Iterations | 28,553 | 5,463,876 | 13,311,418 |
The conclusion is that relatively minor variations in a model can lead to very different performance. Also, a formulation that exploits as much as possible of the problem at hand (in this case: fixed run lengths), is likely the best.
M4: an alternative, complex continuous-time model
if jobs \(i\) and \(j\) execute on the same machine then job \(j\) ends before job \(i\) starts or job \(j\) starts after job \(i\) ends.
Model M4: Alternative continuous-time model |
---|
\[\begin{align}\min&\sum_{\mathit{objs}} \color{darkblue}{\mathit{objw}}_{\mathit{objs}} \cdot \color{darkred}{\mathit{obj}}_{\mathit{objs}} && \\ \hline & \color{darkred}{\mathit{obj}}_{\mathit{completion}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{completion}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{sumtardy}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{tardy}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{numtardy}} = \sum_j \color{darkred}{\mathit{isTardy}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{maxtardy}} \ge \color{darkred}{\mathit{tardy}}_j && \forall j\\ & \color{darkred}{\mathit{obj}}_{\mathit{makespan}} \ge \color{darkred}{\mathit{completion}}_j && \forall j\\ \hline &\color{darkred}\delta_{i,j} + \color{darkred}\delta_{j,i}+\color{darkred}d_{i,j} = 1&& \forall i \lt j \\ &\color{darkred}\delta_{i,j} +\color{darkred}\delta_{j,k} + \color{darkred}\delta_{k,i}\le 2 &&\forall i \lt j \lt k\\ & \color{darkred}x_{i,m}+\color{darkred}x_{j,m}+\color{darkred}d_{i,j}\le 2 && \forall i\lt j \\ & \sum_m \color{darkred}x_{j,m}=1 && \forall j \\ & \color{darkred}{\mathit{completion}}_ j = \color{darkred}{\mathit{start}}_ j + \color{darkblue}{\mathit{proctime}}_ j && \forall j \\ & \color{darkred}{\mathit{start}}_j \ge \color{darkred}{\mathit{completion}}_i - \color{darkblue}M(1-\color{darkred}\delta_{i,j})&&\forall i\ne j, m \\ & \color{darkred}{\mathit{tardy}}_j \ge\color{darkred}{\mathit{completion}}_j -\color{darkblue}{\mathit{due}}_j && \forall j\\ &\color{darkred}{\mathit{tardy}}_j \le \color{darkblue}M \cdot\color{darkred}{\mathit{isTardy}}_j&& \forall j \\ &\color{darkred}{\mathit{completion}}_i \le\color{darkred}{\mathit{start}}_j && \forall i,j|\color{darkblue}{\mathit{precedence}}_{i,j} \\ \hline & \color{darkred}x_{j,m}\in \{0,1\} \\ & \color{darkred}\delta_{i,j} \in \{0,1\} \\ & \color{darkred}d_{i,j} \in [0,1]\\ &\color{darkred}{\mathit{isTardy}}_j\in \{0,1\} \\ & \color{darkred}{\mathit{obj}}_{\mathit{objs}} \ge 0 \\ & \color{darkred}{\mathit{tardy}}_j \ge 0 \\ & \color{darkred}{\mathit{start}}_j \ge \max(1,\color{darkblue}{\mathit{release}}_j) \end{align}\] |
The performance is as follows:
model m1 time-indexed | model m2 continuous-time | model m4 continuous-time | |
---|---|---|---|
Equations | 626 | 10,026 | 28,451 |
Variables | 20,158 | 1,633 | 4,133 |
binary | 20,000 | 1,475 | 2,750 |
Nonzero elements | 191,024 | 49,896 | 85,571 |
Objective | 324.096 | 324.102 | 324.139 |
completion | 2,096 | 2,102 | 2,139 |
sumtardy | 322 | 322 | 322 |
maxtardy | 84 | 84 | 84 |
numtardy | 7 | 7 | 7 |
makespan | 97 | 97 | 97 |
Time (seconds) | 20 | 1800 (time limit) | 1800 (time limit) |
Gap | optimal | 0.13% | 0.15% |
Nodes | 529 | 1,262,868 | 114,732 |
Iterations | 28,553 | 5,463,876 | 3,455,261 |
- The performance is very close to model M2. This formulation does not add much value above the more intuitive and simpler model M2. The M2 model is easier to remember. More complex models are not necessarily better.
- I don't know what the purpose is of the transitivity constraint.
- Some of the constraints should have \(\forall i\ne j\) instead of \(\forall i,j\).
- Of course, we can add ordering constraints to reduce symmetry. I used: machine \(m\) has more jobs (or the same number) than machine \(m+1\). As the machines in my model are identical, we can renumber machines.
M5: positional formulation.
Model M5: Positional model |
---|
\[\begin{align} \min&\sum_{\mathit{objs}} \color{darkblue}{\mathit{objw}}_{\mathit{objs}} \cdot \color{darkred}{\mathit{obj}}_{\mathit{objs}} && \\ \hline & \color{darkred}{\mathit{obj}}_{\mathit{completion}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{completion}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{sumtardy}} = \sum_j \color{darkblue}w_j \cdot \color{darkred}{\mathit{tardy}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{numtardy}} = \sum_j \color{darkred}{\mathit{isTardy}}_j \\ & \color{darkred}{\mathit{obj}}_{\mathit{maxtardy}} \ge \color{darkred}{\mathit{tardy}}_j && \forall j\\ & \color{darkred}{\mathit{obj}}_{\mathit{makespan}} \ge \color{darkred}{\mathit{completion}}_j && \forall j\\ \hline &\sum_{m,p}\color{darkred}x_{j,m,p}=1 &&\forall j \\ & \sum_j\color{darkred}x_{j,m,p}\le 1 &&\forall m,p \\ & \color{darkred}f_{m,p} \ge \color{darkred}f_{m,p-1} + \sum_j \color{darkblue}{\mathit{proctime}}_j\cdot \color{darkred}x_{j,m,p} &&\forall m,p\\ & \color{darkred}f_{m,0}=1 \\ & \color{darkred}f_{m,p} \ge \sum_j (\color{darkblue}{\mathit{release}}_j+\color{darkblue}{\mathit{proctime}}_j)\cdot \color{darkred}x_{j,m,p} && \forall m,p \\ & \color{darkred}f_{m,p} \ge \color{darkred}{\mathit{completion}}_i + \color{darkblue}{\mathit{proctime}}_j\cdot \color{darkred}x_{j,m,p} - \color{darkblue}M\cdot (1-\color{darkred}x_{j,m,p})&& \forall i,j|\color{darkblue}{\mathit{precedence}}_{i,j},m,p \\ & \color{darkred}{\mathit{completion}}_j \ge \color{darkred}f_{m,p} - \color{darkblue}M(1-\color{darkred}x_{j,m,p}) && \forall j,m,p \\ & \color{darkred}{\mathit{completion}}_ j = \color{darkred}{\mathit{start}}_ j + \color{darkblue}{\mathit{proctime}}_ j && \forall j \\ & \color{darkred}{\mathit{tardy}}_j \ge\color{darkred}{\mathit{completion}}_j -\color{darkblue}{\mathit{due}}_j && \forall j\\ &\color{darkred}{\mathit{tardy}}_j \le \color{darkblue}M \cdot\color{darkred}{\mathit{isTardy}}_j&& \forall j \\ &\color{darkred}{\mathit{completion}}_i \le\color{darkred}{\mathit{start}}_j && \forall i,j|\color{darkblue}{\mathit{precedence}}_{i,j} \\ \hline & \color{darkred}x_{j,m,p}\in \{0,1\} \\ & \color{darkred}f_{m,p} \ge 1 \\ &\color{darkred}{\mathit{isTardy}}_j\in \{0,1\} \\ & \color{darkred}{\mathit{obj}}_{\mathit{objs}} \ge 0 \\ & \color{darkred}{\mathit{tardy}}_j \ge 0 \\ & \color{darkred}{\mathit{start}}_j \ge \max(1,\color{darkblue}{\mathit{release}}_j) \end{align}\] |
model m1 time-indexed | model m2 continuous-time | model m5 positional formulation | |
---|---|---|---|
Equations | 626 | 10,026 | 5,742 |
Variables | 20,158 | 1,633 | 4,368 |
binary | 20,000 | 1,475 | 4,050 |
Nonzero elements | 191,024 | 49,896 | 36,564 |
Objective | 324.096 | 324.102 | 324.135 |
completion | 2,096 | 2,102 | 2,135 |
sumtardy | 322 | 322 | 322 |
maxtardy | 84 | 84 | 84 |
numtardy | 7 | 7 | 7 |
makespan | 97 | 97 | 97 |
Time (seconds) | 20 | 1800 (time limit) | 1800 (time limit) |
Gap | optimal | 0.13% | 0.14% |
Nodes | 529 | 1,262,868 | 49,186 |
Iterations | 28,553 | 5,463,876 | 2,368,537 |
- This model is not as effective as the models in [1].
- There are some interesting possibilities to reduce symmetry in this model.
Conclusion
References
- Parallel machine scheduling I, two formulations, https://yetanothermathprogrammingconsultant.blogspot.com/2021/03/parallel-machine-scheduling-i-two.html
- Manne, Alan S. “On the Job-Shop Scheduling Problem.” Operations Research, vol. 8, no. 2, 1960, pp. 219–223.
- Yasin Unlu, Scott J. Mason, Evaluation of mixed integer programming formulations for non-preemptive parallel machine scheduling problems, Computers & Industrial Engineering 58 (2010) 785–800.
Interesting analysis.
ReplyDeleteI'm pleased to see that the simplest model performed best - often modellers have a tendency to complicate models unnecessarily, just because they can.
Your point about model validation is very important. Too many times I've seen models that get an optimal solution to the wrong problem, due to errors in either the situation specification or the model implementation. Having multple models can give us confidence that we have a sensible result.
FYI, I've written a brief summary at https://www.solvermax.com/blog/parallel-machine-scheduling