Sunday, February 23, 2020

Non-convex quadratic models


Some non-convex quadratic models I


Gurobi 9 introduced support for non-convex quadratic models. Cplex already supported some classes of such models (such as non-convex QPs). Here are some models we can now try to solve with these solvers.

Note that you need to specify a solver option for Cplex and Gurobi to accept most non-convex models. Some models can be reformulated by the solver, but "real" non-convex models require the option OptimalityTarget 3 (Cplex) or NonConvex 2 (Gurobi).
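For example, in the Python APIs this looks as follows. This is a minimal gurobipy sketch (the model is just a toy bilinear problem of my own making); the corresponding docplex setting is shown as a comment and should be checked against your version:

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("nonconvex-test")
x = m.addVar(lb=-1, ub=1, name="x")
y = m.addVar(lb=-1, ub=1, name="y")
m.setObjective(x * y, GRB.MAXIMIZE)  # bilinear, hence non-convex
m.Params.NonConvex = 2               # without this, Gurobi 9 rejects the model
m.optimize()

# For Cplex via docplex, the corresponding setting would be:
#   mdl.parameters.optimalitytarget = 3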

1a. Largest Empty Rectangle


Given is an area with \(n\) points:

n=50 random points

The problem is to find the largest empty rectangle, i.e., without any points inside. The solution looks like:

Largest empty rectangle

The idea behind the model below is that a data point \(p_i\) must be either left of, right of, below, or above the rectangle. So if the coordinates of the rectangle are denoted by \(r_{x,lo},r_{x,up},r_{y,lo},r_{y,up}\), then at least one of the following constraints must hold: \[\begin{align} & p_{i,x} \le r_{x,lo} \\& p_{i,y} \le r_{y,lo} \\ & p_{i,x} \ge r_{x,up} \\& p_{i,y} \ge r_{y,up}\end{align}\] The complete mathematical model to find this rectangle can be stated as:

Largest Empty Rectangle Model
\[\begin{align} \max\> & \color{darkred}{\mathit{area}} = \prod_c (\color{darkred}r_{c,up}-\color{darkred}r_{c,lo}) \\ & \color{darkblue}p_{i,c} \le \color{darkred}r_{c,lo}+\color{darkblue}M_c(1-\color{darkred}\delta_{i,c,lo}) && \forall i,c\\ & \color{darkblue}p_{i,c} \ge \color{darkred}r_{c,up}-\color{darkblue}M_c(1-\color{darkred}\delta_{i,c,up}) && \forall i,c\\ & \sum_{c,s} \color{darkred}\delta_{i,c,s} \ge 1 && \forall i \\ & \color{darkblue} L_c \le \color{darkred}r_{c,lo} \le \color{darkred}r_{c,up}\le \color{darkblue} U_c \\ & \color{darkred}\delta_{i,c,s} \in \{0,1\} \end{align} \]
\[\begin{align} & i = \{1,\dots,n\}\\ & c = \{x,y\}\\ & s = \{lo,up\} \\ & \color{darkblue} L_c = 0 \\ & \color{darkblue} U_c = 1 \\ & \color{darkblue} M_c = \color{darkblue}U_c-\color{darkblue}L_c \end{align}\]


For the 2d case, the objective can be written as \[\max\> \mathit{area} = (r_{x,up}-r_{x,lo})\cdot (r_{y,up}-r_{y,lo})\] The outer area is a given rectangle with limits \(L_x, U_x, L_y, U_y\). I used here the unit box: \(L_x=L_y = 0\) and \(U_x=U_y = 1\). The big-M constants \(M_x, M_y\) need tight values. We can use the area dimensions for that. For a formulation without big-M constraints, we can implement the implications using indicator constraints: \[\begin{align} & \delta_{i,c,lo}=1 \implies p_{i,c} \le r_{c,lo} \\ &\delta_{i,c,up}=1 \implies p_{i,c} \ge r_{c,up}\end{align}\]

The problem is not convex because of the objective. It is noted that we can simplify the objective a bit by writing \[\begin{align}\max\>&\mathit{area} =  \mathit{side}_x \cdot \mathit{side}_y\\ & \mathit{side}_c = r_{c,up} - r_{c,lo}\end{align}\] This is a small problem with just 4 continuous quadratic variables and \(n\times 4\) binary variables. Gurobi or Cplex solve this model very quickly.
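Here is a gurobipy sketch of the complete model (random data; the variable and set names are mine, not from a particular source):

import gurobipy as gp
from gurobipy import GRB
import numpy as np

np.random.seed(123)
n = 50
C, S = ["x", "y"], ["lo", "up"]
pts = np.random.rand(n, 2)                        # points in the unit box
p = {(i, c): pts[i, k] for i in range(n) for k, c in enumerate(C)}
L, U = 0.0, 1.0
M = U - L                                         # tight big-M: box dimensions

m = gp.Model("largest-empty-rectangle")
r = m.addVars(C, S, lb=L, ub=U, name="r")
side = m.addVars(C, name="side")                  # lb=0 enforces r_lo <= r_up
delta = m.addVars(range(n), C, S, vtype=GRB.BINARY, name="delta")

m.addConstrs(side[c] == r[c, "up"] - r[c, "lo"] for c in C)
# each point must be left of, right of, below, or above the rectangle
m.addConstrs(p[i, c] <= r[c, "lo"] + M * (1 - delta[i, c, "lo"])
             for i in range(n) for c in C)
m.addConstrs(p[i, c] >= r[c, "up"] - M * (1 - delta[i, c, "up"])
             for i in range(n) for c in C)
m.addConstrs(delta.sum(i, "*", "*") >= 1 for i in range(n))

m.setObjective(side["x"] * side["y"], GRB.MAXIMIZE)  # non-convex (bilinear)
m.Params.NonConvex = 2
m.optimize()
print("area:", m.ObjVal)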

This problem can also be solved with available global solvers. Both Antigone and Baron have no problems with it. The optimal solution looks like:


----     50 VARIABLE area.L                =        0.113  

----     50 VARIABLE r.L  coordinates of rectangle

           lo          up

x       0.403       0.792
y       0.285       0.575


I also tried the open-source solver Couenne. That was a surprise:


               S O L V E      S U M M A R Y

     MODEL   m                   OBJECTIVE  area
     TYPE    MIQCP               DIRECTION  MAXIMIZE
     SOLVER  COUENNE             FROM LINE  49

**** SOLVER STATUS     1 Normal Completion         
**** MODEL STATUS      1 Optimal                   
**** OBJECTIVE VALUE                0.0121


----     50 VARIABLE area.L                =        0.012  

----     50 VARIABLE r.L  coordinates of rectangle

           lo          up

x       0.939       0.982
y       0.720       1.000

This is obviously a bad and non-optimal solution. I don't know what is going on here, but this looks like a bug.

1b. 3d largest box 


It is easy to make a 3d version of the model above. Just replace \(c = \{x,y\}\) by \(c = \{x,y,z\}\). Unfortunately, there is one complication: now the model has an objective that is no longer quadratic. However, this can be fixed by chaining: \[\begin{align} \max\>&\mathit{volume} = \mathit{area}\cdot \mathit{side}_{z}\\ & \mathit{area} = \mathit{side}_{x} \cdot \mathit{side}_{y} \\ & \mathit{side}_c = r_{c,up}-r_{c,lo} && \forall c \end{align}\]
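In code, the chaining is just an extra variable plus a quadratic equality. A fragment, building on the 2d sketch above and now assuming side is indexed by {x,y,z}:

# fragment: chain the products so every term stays quadratic
area = m.addVar(name="area")
m.addConstr(area == side["x"] * side["y"])       # quadratic equality
m.setObjective(area * side["z"], GRB.MAXIMIZE)   # volume = area * side_z
m.Params.NonConvex = 2                           # quadratic equalities are non-convex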

A solution can look like:


----     52 VARIABLE area.L                =        0.340  
            VARIABLE volume.L              =        0.146  

----     52 VARIABLE r.L  coordinates of 3D box

           lo          up

x       0.608       0.998
y       0.091       0.963
z       0.140       0.571

Note that Cplex cannot handle this problem: Cplex allows non-convexities in the objective only (apart from some cases where it can reformulate things).

3d version: largest box not clashing with given points



2. Hostile Brothers Problem



A father is facing the following problem. He has a plot of land, and he has \(n\) sons who don't get along at all. In order to minimize friction, he wants to build a house for each of his sons, such that the minimum distance between the houses is maximized. The mathematical model can look like:


Hostile Brothers Problem
\[\begin{align} \max\> & \color{darkred}z \\ & \color{darkred}z \le \sum_{c} (\color{darkred}x_{i,c}-\color{darkred}x_{j,c})^2&& \forall i\lt j\\ & \color{darkred}x_{i,c} \in [\color{darkblue}L,\color{darkblue}U]\end{align} \]

Here \(i\) indicates the point, \(i \in \{1,\dots,n\}\) and \(c\) is the coordinate. For a 2D problem, we have \(c \in \{x,y\}\).
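A gurobipy sketch of this model (again with names of my own choosing):

import gurobipy as gp
from gurobipy import GRB

n, C = 50, range(2)                     # 50 houses in the unit square
m = gp.Model("hostile-brothers")
x = m.addVars(range(n), C, lb=0, ub=1, name="x")
z = m.addVar(name="z")

# z is bounded by every squared pairwise distance
m.addConstrs((z <= gp.quicksum((x[i, c] - x[j, c]) * (x[i, c] - x[j, c])
                               for c in C)
              for i in range(n) for j in range(i + 1, n)), name="dist")

m.setObjective(z, GRB.MAXIMIZE)
m.Params.NonConvex = 2                  # z below a convex function is non-convex
m.Params.TimeLimit = 7200
m.optimize()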

When we use \(n=50\) random points, Gurobi is not able to come even close to proven optimality. I used a time limit of 2 hours, and the final gap is enormous:


 1560307 1354785    0.40487   42 1653    0.01935    0.67173  3371%   332 7171s
 1562085 1356368     cutoff  283         0.01935    0.67173  3371%   332 7177s
 1563023 1356980     cutoff  285         0.01935    0.67173  3371%   332 7180s
 1564565 1358071    0.01937  248  930    0.01935    0.67173  3371%   332 7190s
 1565720 1359205    0.60071   48 1510    0.01935    0.67173  3371%   332 7196s
 1567335 1359826     cutoff  272         0.01935    0.67173  3371%   332 7200s


Cutting planes:
  RLT: 1312

Explored 1567341 nodes (520860935 simplex iterations) in 7200.00 seconds
Thread count was 8 (of 16 available processors)


Summary:

  • Best found solution: 0.01935
  • Best possible objective value: 0.67173
  • Relative gap: 3371%

The big gap is partly due to the small best-found objective (0.01935). But mainly, this is a very difficult job for Gurobi, even though the problem is relatively small: just 100 continuous variables.

By inspecting the solution, we see this solution is not very good at all:

Solution found by Gurobi after 2 hours (obj=0.01935)


I expect a much more regular pattern.

We can try to improve performance by noticing there is much symmetry in the model. The same solution can be obtained by just renumbering the points. To break symmetry, I added the constraint: \[\sum_c x_{i,c} \ge \sum_c x_{i-1,c}\] This was not very successful:


 2978972 2407171    0.01708  130  813    0.00885    0.64271  7159%   222 7180s
 2981109 2409232    0.52720   64 2378    0.00885    0.64271  7159%   222 7185s
 2984648 2411500    0.20399  107 2156    0.00885    0.64271  7159%   222 7191s
 2986302 2412811    0.12220  127 1721    0.00885    0.64271  7159%   222 7195s
 2987852 2413879    0.06942  175 1728    0.00885    0.64271  7159%   222 7200s

Explored 2988184 nodes (662244035 simplex iterations) in 7200.01 seconds
Thread count was 8 (of 16 available processors)

So we see:
  • The best-found solution (0.00885) is worse than before (0.01935)
  • The best possible objective value (0.64271) is a bit tighter than before (0.67173)
  • The relative gap is worse (7159% vs 3371%)
As an alternative, let's try Baron on this model. I dropped the symmetry breaking constraint here (adding this constraint was not beneficial, at least not during the initial period of the search). We see in the first few minutes:


  Iteration    Open nodes         Time (s)    Lower bound      Upper bound
*         1             1             5.51     0.257746E-01     2.00000    
*         1             1             7.22     0.274782E-01     2.00000    
          1             1             7.41     0.274782E-01     2.00000    
         12             7            49.51     0.274782E-01     1.72727    
         31            16            80.03     0.274782E-01     1.72727    
         46            24           111.27     0.274782E-01     1.72727    

I.e. Baron finds a much better solution very quickly (but is far from getting the gap close to zero).

Another approach is to use a multi-start strategy with a local NLP solver: generate a random starting point, solve, and repeat. When I do this 20 times with the CONOPT NLP solver, I see:


----     57 PARAMETER objs  objective values of NLP solver

cycle1  0.02751,    cycle2  0.02669,    cycle3  0.02705,    cycle4  0.02041,    cycle5  0.02620,    cycle6  0.01235
cycle7  0.02675,    cycle8  0.02590,    cycle9  0.02660,    cycle10 0.02679,    cycle11 0.02558,    cycle12 0.02644
cycle13 0.02622,    cycle14 0.02625,    cycle15 0.02620,    cycle16 0.02749,    cycle17 0.02742,    cycle18 0.02692
cycle19 0.02675,    cycle20 0.02688,    best    0.02751

This is the best we have seen so far. By accident, the best results were found in the first cycle. The results look like:

Best solution with Multi-start Conopt (obj=0.02751)

This solution looks much better than the one Gurobi found. In addition, CONOPT needs only 30 seconds to solve these 20 problems.
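The same multi-start idea can be tried with open-source tools. Here is a sketch using scipy's local SLSQP solver (with a smaller n to keep things snappy; like any local method, the quality depends on the random starts):

import numpy as np
from scipy.optimize import minimize

n, trials = 20, 20
rng = np.random.default_rng(42)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

def solve_from(start):
    # variables v = [z, x_1..x_n, y_1..y_n]; maximize z <=> minimize -z
    cons = [{"type": "ineq",
             "fun": lambda v, i=i, j=j:
                 (v[1 + i] - v[1 + j]) ** 2
                 + (v[1 + n + i] - v[1 + n + j]) ** 2 - v[0]}
            for (i, j) in pairs]
    bounds = [(0, 2)] + [(0, 1)] * (2 * n)
    res = minimize(lambda v: -v[0], start, method="SLSQP",
                   bounds=bounds, constraints=cons)
    return -res.fun

best = max(solve_from(np.concatenate(([0.0], rng.random(2 * n))))
           for _ in range(trials))
print("best minimum squared distance:", best)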

Conclusion:

  • Cplex cannot solve this problem
  • Gurobi finds a solution with objective 0.01935 (time limit: 2 hours)
  • Baron finds a solution of  0.02748 in 100 seconds
  • Conopt finds a solution of 0.02751 in a loop of 20 solves that takes 30 seconds
  • Increasing this to 200 solves gives an objective of 0.02771
  • Remember that we are maximizing
Even though this is a non-convex quadratic problem, it makes sense to look beyond quadratic solvers.



Monday, February 10, 2020

Longest path problem


This is a question that regularly pops up: how to solve the longest path problem? At first sight, this is easy. Just replace the "minimize" in the shortest path problem by "maximize" and we are good to go. Unfortunately, as opposed to the well-known shortest path problem, the longest path problem is not that easy to state and solve.

1. Standard Shortest Path Formulation


The standard shortest path can be formulated as a Mixed Integer Programming problem:

Shortest Path Problem
\[\begin{align} \min& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ &\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkblue} b_i = \begin{cases}1 & \text{if $i$ is a source node} \\ -1 & \text{if $i$ is a sink node} \\ 0 & \text{otherwise} \end{cases} \end{align} \]

where \(A\) is our network and \(b\) is a sparse data vector indicating the exogenous inflow or outflow at the source or sink node.

Network with the shortest path

There is an implicit assumption in this model: there are no negative cycles. In the above example, \(d_{i,j}\) represents distance, so we have \(d_{i,j}\ge 0\) and there is no problem with negative cycles.

Shortest path problems are very easy MIPs. The LP relaxation automatically delivers an integer solution, so the problem can be solved as an LP; these LPs are very sparse (only two non-zero elements per column) and very easy to solve. In addition, specialized network solvers are widely available.
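A gurobipy sketch of the flow-balance formulation, on a tiny hypothetical graph (because of the network structure, we can declare the variables continuous):

import gurobipy as gp
from gurobipy import GRB

# hypothetical arc data, duplicated so we can travel in both directions
dist = {("A", "B"): 4, ("B", "C"): 3, ("A", "C"): 9, ("B", "D"): 5, ("C", "D"): 2}
dist.update({(j, i): d for (i, j), d in list(dist.items())})
nodes = {i for arc in dist for i in arc}
b = {i: 0 for i in nodes}
b["A"], b["D"] = 1, -1                             # source and sink

m = gp.Model("shortest-path")
x = m.addVars(dist.keys(), lb=0, ub=1, name="x")   # LP relaxation suffices
m.addConstrs(x.sum("*", i) + b[i] == x.sum(i, "*") for i in nodes)
m.setObjective(x.prod(dist), GRB.MINIMIZE)
m.optimize()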

The problem in the picture has 20 nodes and 37 × 2 arcs. In this example, the arcs are duplicated to make sure we can go in two directions. This leads to an LP of 74 variables and 20 constraints. The solution looks like:


----     75 VARIABLE z.L                   =      128.538  objective

----     75 VARIABLE x.L  flow

            point6(B)     point15     point16     point18     point20

point5(A)                               1.000
point15         1.000
point16                                             1.000
point18                                                         1.000
point20                     1.000


2. Just use max


When we just turn the above problem into a maximization problem, we get results that are difficult to interpret. But they certainly don't form what we would call a path:


----     78 VARIABLE z.L                   =     1638.406  objective

----     78 VARIABLE x.L  flow

               point1      point2      point3      point4   point5(A)   point6(B)      point7      point8      point9

point1                                              1.000
point3                                                                                                          1.000
point4          1.000
point5(A)                                                                                           1.000
point6(B)                                                                               1.000
point7                                                                      1.000
point8                                                          1.000
point9                                  1.000
point10                     1.000
point12                                                                                                         1.000
point13                                                                                 1.000
point14                                             1.000                                           1.000
point15                                                                     1.000
point16                                                                                             1.000
point17         1.000                               1.000
point18                                 1.000                                                                   1.000
point19                                                                                 1.000
point20                     1.000

        +     point10     point11     point12     point13     point14     point15     point16     point17     point18

point1                                                                                              1.000
point2          1.000
point3                                                                                                          1.000
point4                                                          1.000                               1.000
point5(A)                                                                               1.000
point7                                              1.000
point8                                                          1.000                   1.000
point9                                  1.000                                                                   1.000
point10                                                                     1.000
point11                                                                                                         1.000
point12                                                                                 1.000                   1.000
point14                                                                                 1.000       1.000
point15         1.000
point16                                 1.000                   1.000                               1.000       1.000
point17                                                         1.000                   1.000                   1.000
point18                     1.000       1.000                                                       1.000
point19                                             1.000                   1.000                   1.000       1.000
point20         1.000       1.000                   1.000                   1.000

        +     point19     point20

point2                      1.000
point7          1.000
point10                     1.000
point11                     1.000
point13         1.000       1.000
point15         1.000
point17         1.000
point18         1.000       1.000
point19                     1.000
point20         1.000

Maximization results

Basically, all but a few lines in the plot are traveled in both directions. Conclusion: this is not what we are looking for.

3. TSP-like solution


To form a proper path (called an elementary path), we need to add two sets of constraints [1]:
  1. The outflow of each node goes to just one arc: \[\sum_{j|(i,j)\in A} x_{i,j} \le 1\>\forall i\]
  2. Forbid any sub-tours to be formed. Sub-tour elimination constraints are well-known from the Traveling Salesman Problem. 
Using the simple MTZ (Miller, Tucker, and Zemlin) approach, we can formulate:


Longest Path Problem
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}n-1)(\color{darkred}x_{i,j}-1) && \forall i,j|i\ne\text{source},j\ne\text{sink} \\&\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}t_i \ge 0  \end{align} \]

Here \(n\) is the number of nodes.
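A fragment showing the extra pieces on top of the flow-balance sketch above (x must now be binary; "A" and "D" are the hypothetical source and sink):

# fragment: longest elementary path via MTZ
for v in x.values():
    v.VType = GRB.BINARY
n = len(nodes)
t = m.addVars(nodes, lb=0, name="t")                          # ordering variables
m.addConstrs(x.sum(i, "*") <= 1 for i in nodes)               # out-degree <= 1
m.addConstrs(t[j] >= t[i] + 1 + (n - 1) * (x[i, j] - 1)
             for (i, j) in dist if i != "A" and j != "D")     # MTZ
m.setObjective(x.prod(dist), GRB.MAXIMIZE)
m.optimize()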

The results look like:


----    126 VARIABLE z.L                   =      432.987  objective

----    126 VARIABLE x.L  flow

               point1      point2      point3      point4   point6(B)      point7      point8      point9     point10     point12

point2                                                                                                          1.000
point4          1.000
point5(A)                                                                               1.000
point9                                  1.000
point12                                                                                             1.000
point14                                             1.000
point15                                                         1.000
point16                                                                                                                     1.000
point19                                                                     1.000
point20                     1.000

        +     point13     point14     point15     point16     point17     point18     point19     point20

point1                                                          1.000
point3                                                                      1.000
point7          1.000
point8                      1.000
point10                                 1.000
point13                                                                                             1.000
point17                                             1.000
point18                                                                                 1.000

Solution of Longest Elementary Path Model


This looks much better.

More formulations can be found in [1].

4. Ad-hoc approach


Let's drop the sub-tour elimination constraints, and have a look at the solution of the remaining model:


Problem A: drop sub-tour elimination constraints
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ &\color{darkred}x_{i,j} \in \{0,1\}  \end{align} \]

When we look at the solution we see a number of sub-tours of just two points. This means: travel from \(C\rightarrow D\) and back.

Sub-tours with 2 points


These special sub-tours are easily prevented: add a cut of the form \[x_{i,j}+x_{j,i} \le 1\] We can add them only for the \(i \rightarrow j \rightarrow i\) offenders or just for all possible \(i \lt j\). Here I did the latter. The model becomes:


Problem B: add cuts
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}x_{i,j}+\color{darkred}x_{j,i} \le 1 && \forall i \lt j \\ &\color{darkred}x_{i,j} \in \{0,1\}  \end{align} \]

After adding these cuts and solving problem B we see:

Results after adding cuts

The solution has no new sub-tours, so we can conclude this is the optimal solution.

Thus far, this is a bit of an ad-hoc method. It would fail if we observed more complex sub-tours. However, it is not too difficult to make this more general: we can add appropriate cuts whenever we observe sub-tours and re-solve the model. Such a cutting plane algorithm can actually outperform models where we add sub-tour elimination constraints in advance. With modern solvers, it is even possible to add this cut generation inside the branch-and-bound search (i.e., no re-solve needed).
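A sketch of this dynamic scheme with Gurobi lazy constraints, building on the flow-balance model of problem A with binary x (the cycle detection exploits that every node has at most one outgoing arc in a solution):

import gurobipy as gp
from gurobipy import GRB

def find_cycles(arcs):
    succ = dict(arcs)                  # out-degree <= 1: successor map
    cycles, seen = [], set()
    for start in succ:
        node, path = start, []
        while node in succ and node not in seen:
            seen.add(node)
            path.append(node)
            node = succ[node]
            if node in path:           # walked back into the current path
                cycles.append(path[path.index(node):])
                break
    return cycles

def subtour_cb(model, where):
    if where == GRB.Callback.MIPSOL:
        vals = model.cbGetSolution(model._x)
        arcs = [a for a, v in vals.items() if v > 0.5]
        for cyc in find_cycles(arcs):
            nodes_in = set(cyc)        # cut: at most |cyc|-1 arcs inside the cycle
            model.cbLazy(gp.quicksum(model._x[i, j] for (i, j) in model._x
                                     if i in nodes_in and j in nodes_in)
                         <= len(cyc) - 1)

m._x = x
m.Params.LazyConstraints = 1
m.optimize(subtour_cb)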

Conclusion


The conclusion is: "solve the longest path problem" is not as easy as it seems. We need to employ TSP-like machinery, which means there is no easy MIP model: a straight MIP formulation is not that simple to state and will only work for relatively small instances. A cutting plane approach can help here.

References


  1. Leonardo Taccari, Integer programming formulations for the elementary shortest path problem, European Journal of Operational Research, Volume 252, Issue 1, 1 July 2016, Pages 122-130

Wednesday, January 29, 2020

Multi-objective optimization: min sum vs min max


The issue

In many models, we observe that an objective that minimizes a sum delivers undesirable results. Here are some examples:

Assignment under preferences

In [1], we formulated a MIP model that tackles the problem of creating groups while taking into account the preferences of people. When we just maximize the total sum of preferences that are honored by the assignment, we see that some participants get a terrible assignment, i.e., they take one for the team. It requires some thought to develop a model that produces a more equitable solution. In a sense, we are trading off overall efficiency for fairness.

Job scheduling

Typically, in job scheduling models, we minimize (1) the total make-span (do things as quickly as possible) and (2) the sum of the tardiness of jobs. Sometimes we use a weighting scheme for tardy jobs: jobs from valuable clients may get a higher weight to make it less likely they are chosen to be tardy. There are some terms we can add to the objective, for instance:

  • The number of jobs that are tardy. The sum may distribute the pain over many jobs. If we want to reduce the number of tardy jobs we can add a term that minimizes the number of tardy jobs.
  • The maximum tardiness of a job. Adding such a term will try to prevent the situation where a single job will take the brunt of the pain.

These are examples where just minimizing the sum can lead to unpalatable solutions. It may require some effort to design a more sophisticated objective that takes into account more aspects than just overall efficiency.

Example model


In [2] I discussed a MIP formulation for a matching problem:

Given \(N=2M\) points, find a matching that minimizes the sum of the cost (represented by lengths between the matched points). 

A. Single solution


In this section, I discuss how we can look at the problem in different ways:

  1. MIN SUM: Solve the problem minimizing the sum of the cost.
  2. MIN MAX: Solve the problem minimizing the maximum cost. 
  3. MIN MAX followed by MIN SUM: this variant exploits that MIN MAX is not unique. 
  4. MIN SUM followed by MIN MAX: due to uniqueness, this is not an interesting problem.
  5. A weighted sum of MIN SUM and MIN MAX.

A.1. MIN SUM Matching 


The Min Sum problem can be stated as a MIP problem:

MIP Model for Min Sum Matching
\[\begin{align}\min& \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]
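A gurobipy sketch of this matching model (hypothetical random points):

import gurobipy as gp
from gurobipy import GRB
import numpy as np

np.random.seed(1)
n = 50                                     # must be even
pt = np.random.rand(n, 2) * 100
d = {(i, j): float(np.linalg.norm(pt[i] - pt[j]))
     for i in range(n) for j in range(i + 1, n)}

m = gp.Model("min-sum-matching")
x = m.addVars(d.keys(), vtype=GRB.BINARY, name="x")
# every point is matched to exactly one other point
m.addConstrs(x.sum(i, "*") + x.sum("*", i) == 1 for i in range(n))
m.setObjective(x.prod(d), GRB.MINIMIZE)
m.optimize()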

The optimal MIP solution looks like:


Optimal MIN SUM solution


In the solution, we see a spike: the longest length is much larger than the rest.

A.2. MIN MAX Problem


Instead of minimizing the sum, we can also minimize the length of the largest segment. Formally we want \[\min \max_{i\lt j} d_{i,j} x_{i,j}\] This can be linearized in a standard manner.

MIP Model for Min-Max Matching
\[\begin{align}\min\>& \color{darkred}z\\ & \color{darkred}z \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall  i \lt j\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The results look like:

Optimal MIN MAX solution


We see we do much better than the maximum length of the MIN SUM problem.

One issue with the MIN MAX problem is that the solution is not unique. Any configuration with lengths less than the maximum length is optimal.

A.3. MIN MAX followed by MIN SUM 


Instead of just using the MIN MAX objective, we can look at the problem as a multi-objective problem:

Multi-objective Model
\[\begin{align}\min \>& \color{darkred}z_1 = \max_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ \min\>& \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

Of course we need to reformulate the \(\max\) function to make it linear. The multi-objective structure was implemented as:


  1. Solve problem with the MIN MAX objective\[\begin{align} \min\>&z_1\\ &z_1 \ge d_{i,j} x_{i,j} &&\forall i\lt j \\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 \\ &x_{i,j} \in \{0,1\} \end{align}\] Let \(z_1^*\) be the optimal objective.
  2. Solve problem with the MIN SUM, subject to MIN MAX objective is not deteriorating (i.e. the length of each segment is less than the maximum length found in the previous step). \[\begin{align}\min\>& z_2 = \sum_{i,j|i \lt j} d_{i,j} x_{i,j} \\ & d_{i,j} x_{i,j} \le z_1^* && \forall i\lt j\\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 && \forall i\\ & x_{i,j} \in \{0,1\}\end{align}\] 

This is typically called the lexicographic method or preemptive method.
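In code, the lexicographic method is just two solves on the same model. A fragment continuing the matching sketch above:

# fragment: preemptive (lexicographic) method
z1 = m.addVar(name="z1")
m.addConstrs(z1 >= d[i, j] * x[i, j] for (i, j) in d)

m.setObjective(z1, GRB.MINIMIZE)           # step 1: MIN MAX
m.optimize()
z1star = m.ObjVal

# step 2: MIN SUM, subject to not deteriorating the MIN MAX objective
m.addConstrs(d[i, j] * x[i, j] <= z1star for (i, j) in d)
m.setObjective(x.prod(d), GRB.MINIMIZE)
m.optimize()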

The results look like:

Optimal multi-objective solution


We see a much better distribution of the lengths with this approach. The underlying reason is that the MIN MAX solution is not unique. There are a ton of solutions with all lengths less than the MIN-MAX objective (at least 50k: I stopped after reaching this number). With the second objective, we choose the "best" of all these solutions.

Here is a summary of the results:

Model                              Objective         Sum         Max

MIN SUM                              241.267     241.267      38.768
MIN MAX                               22.954     403.265      22.954
MULTI-OBJECTIVE (second solve)       247.382     247.382      22.954

We can see that:

  • The largest segment is very long for the MIN SUM model
  • The sum is very large for the MIN MAX model.
  • The MULTI model has a sum only a bit larger than that of the MIN SUM model, while its largest length is identical to the MIN MAX model. It avoids the disadvantages of both the MIN SUM and MIN MAX models.

A.4. MIN SUM followed by MIN MAX


The problem formed by first solving the MIN SUM problem followed by the MIN MAX model is not of much value: we get the same solution as the MIN-SUM problem. The reason is that the MIN-SUM solution is unique. That means that running the second model does not yield a different solution.

A.5. A weighted sum of MIN SUM and MIN MAX


An alternative to the preemptive or lexicographic method is to use a composite objective with some weights:

Multi-objective Model via Weighted Sum
\[\begin{align}\min \>&  \color{darkblue}w_1 \color{darkred}z_1 +  \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge   \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i,j| i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The convention is that we use \(w_i\gt 0\) and \(\sum_i w_i=1\). If we use \[\begin{cases} w_1 = 0.01 \\ w_2 = 0.99\end{cases}\] we get the same solution as in section A.3 (MIN MAX followed by MIN SUM).

B. Finding All Efficient Solutions


In section A, we looked at characterizing the "optimal" solution by a single point. When we have multiple objectives (min-sum and min-max), it can be insightful to look at all "optimal" points, or a representative subset. An optimal solution in this context can be defined as a solution that is not dominated by any other solution. Such a solution is called efficient. The collection of all efficient solutions is called the efficient frontier.

B.1. Weighted sum loop


A first method is to vary the weights and solve for each weight. We don't want weights of zero (or one), so I generated the following step values:


----    135 PARAMETER step  

step0  0.010,    step1  0.050,    step2  0.100,    step3  0.150,    step4  0.200,    step5  0.250,    step6  0.300
step7  0.350,    step8  0.400,    step9  0.450,    step10 0.500,    step11 0.550,    step12 0.600,    step13 0.650
step14 0.700,    step15 0.750,    step16 0.800,    step17 0.850,    step18 0.900,    step19 0.950,    step20 0.990

If we solve our weighted sum model for these values, we get the following report:


----    147 PARAMETER front  

                w1          w2         sum         max         obj

step0        0.010       0.990     247.382      22.954      25.199
step1        0.050       0.950     247.382      22.954      34.176
step2        0.100       0.900     247.382      22.954      45.397
step3        0.150       0.850     247.382      22.954      56.619
step4        0.200       0.800     247.382      22.954      67.840
step5        0.250       0.750     247.382      22.954      79.061
step6        0.300       0.700     247.382      22.954      90.283
step7        0.350       0.650     247.382      22.954     101.504
step8        0.400       0.600     247.382      22.954     112.726
step9        0.450       0.550     247.382      22.954     123.947
step10       0.500       0.500     247.382      22.954     135.168
step11       0.550       0.450     247.382      22.954     146.390
step12       0.600       0.400     247.382      22.954     157.611
step13       0.650       0.350     247.382      22.954     168.833
step14       0.700       0.300     247.382      22.954     180.054
step15       0.750       0.250     241.267      38.768     190.642
step16       0.800       0.200     241.267      38.768     200.767
step17       0.850       0.150     241.267      38.768     210.892
step18       0.900       0.100     241.267      38.768     221.017
step19       0.950       0.050     241.267      38.768     231.142
step20       0.990       0.010     241.267      38.768     239.242

This is a bit disappointing: we only observe two different solutions. And we saw these before.

We may miss some efficient (i.e., non-dominated) solutions. In general, the weighted sum method suffers from two problems:

  1. We may miss some efficient solutions. Although each point found with the weighted sum method is non-dominated, the other way around is not true: there may be efficient solutions we cannot see with this method.
  2. If this method finds several efficient points, these points may not be nicely distributed.
It is noted that in our simple loop we can miss efficient solutions for two reasons:

  • The inherent problem of the weighted sum model: it can miss solutions.
  • The way we set up the loop: we only looked at 20 different weights. When we increase this, we may see more non-dominated solutions.

There is another method we can use.

B.2. Find non-dominated solutions


We can actively look for non-dominated solutions as follows:

  1. Start with arbitrary weights \(w_1, w_2 \in (0,1)\) to find a first solution on the efficient frontier. E.g. \(w_1 = w_2 = 0.5.\)
  2. Add to the model constraints on the objective values \(z_1, z_2\) such that a new solution is better in one of the objectives than any other found efficient solution
  3. Also, add a cut to the model that forbids the current solution.
  4. Solve. If infeasible, STOP. We are done.
  5. Go to step 2.

The model solved at each step \(K\) looks like:


MIP Model in search for non-dominated solutions
\[\begin{align}\min \>& \color{darkblue}w_1 \color{darkred}z_1 + \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i \\ & \color{darkred}z_1 \le \color{darkblue}z_1^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_1 \color{darkred}\delta && k=1,\dots,K-1\\ & \color{darkred}z_2 \le \color{darkblue}z_2^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_2 (1-\color{darkred}\delta) && k=1,\dots,K-1 \\ & \sum_{i,j|i\lt j} (2 \color{darkblue}x^{(k)}_{i,j}-1) \color{darkred}x_{i,j} \le \sum_{i,j|i\lt j} \color{darkblue}x^{(k)}_{i,j}-1 && k=1,\dots,K-1 \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}\delta \in \{0,1\} \end{align} \]

Here \(z^{(k)}\) and \(x^{(k)}\) indicate solutions found in earlier iterations. \(\epsilon\) is a small constant to make sure we are actually improving an objective. Good values for \(M\) can be found using our earlier optimization models. E.g.: \[\begin{align} &M_1 = 38.768 - 22.954 + \epsilon \\ & M_2 = 247.382 - 241.267 + \epsilon\end{align}\] The constraints \[\begin{align} &z_1 \le  z_1^{(k)} - \epsilon + M_1 \delta\\ &z_2 \le z_2^{(k)} - \epsilon + M_2 (1-\delta)\end{align}\] say: one of the objectives should be (strictly) better compared to any point we have found earlier. Finally, the constraint \[\sum_{i,j|i\lt j} (2 x^{(k)}_{i,j}-1) x_{i,j} \le \sum_{i,j|i\lt j} x^{(k)}_{i,j}-1\] forbids any previously found point from being considered again.
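A fragment showing how the cuts for one earlier point \(k\) could be added (assumed names: xk holds the earlier 0/1 solution, z1k and z2k its objective values, and eps, M1, M2 as above):

# fragment: cuts for one previously found point k
delta_k = m.addVar(vtype=GRB.BINARY, name="delta_k")
m.addConstr(z1 <= z1k - eps + M1 * delta_k)         # improve z1, or...
m.addConstr(z2 <= z2k - eps + M2 * (1 - delta_k))   # ...improve z2
# no-good cut: forbid the earlier 0/1 point xk
m.addConstr(gp.quicksum((2 * xk[i, j] - 1) * x[i, j] for (i, j) in x)
            <= sum(xk.values()) - 1)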

When we run this algorithm, we find the following solutions:


----    228 PARAMETER front2  

               sum         max         obj

point1   247.38237    22.95441   135.16839
point2   243.76532    33.90486   138.83509
point3   241.26672    38.76785   140.01729

We see that point2 is a newly found efficient solution. In addition, we only needed to solve 4 models (the last model, not shown in the results, is infeasible). This model is substantially more complicated than the previous loop with different weights. But we get something in return for this additional complexity: this algorithm produces all efficient points and, depending on the number of non-dominated solutions, may need fewer solves.

Conclusion


In practical models, just what constitutes an optimal solution is not always easy to define. An overall best solution may hurt some individuals disproportionately. In our simple example, we saw that the MIN SUM solution contained a few very bad matches. Trying to fix this with a multi-objective approach requires some thought. And this may give rise to new questions, such as: give me all (or some subset) of the non-dominated solutions. This again is not an easy question.

References


Tuesday, January 28, 2020

How to model y=min(x1,x2)


Introduction


The problem of modeling \(y=\min(x_1,x_2)\) in optimization models is not completely trivial. Here are some notes.

Small example model


To experiment, here is a small test model:

MINLP Model
\[\begin{align} \max\>& \color{darkred} z= \color{darkred}x_1 + 2\color{darkred}x_2\\ & 2\color{darkred}x_1 + \color{darkred}x_2 = 5 + \min(\color{darkred}x_1,\color{darkred}x_2)\\ & \color{darkred}x_1,\color{darkred}x_2 \in [0,4] \end{align}\]


Global MINLP solvers


Interestingly the global MINLP solvers Baron, Couenne, and Antigone (under GAMS) have no direct facilities for this function:


SolverError message
Baron*** Cannot handle function 'min'
CouenneGams function min not supported.
AntigoneERROR: Unsupported function min in equation e0


This is unfortunate. I think it would be better if this function were supported directly. Of course, we are not defeated that easily. As the solvers do support the \(\mathrm{abs}()\) function, let's see if we can use that instead. A possible reformulation is to replace \(\min(x_1,x_2)\) by \[\min(x_1,x_2)=\frac{x_1+x_2-|x_1-x_2|}{2}\] The derivation is as follows: \[\frac{x_1+x_2-|x_1-x_2|}{2} = \begin{cases}\displaystyle\frac{x_1+x_2-x_1+x_2}{2}=x_2 & \text{if $x_1\ge x_2$}\\ \displaystyle\frac{x_1+x_2+x_1-x_2}{2}=x_1 & \text{if $x_1\lt x_2$}\end{cases}\] Using this we see:

                           LOWER          LEVEL          UPPER

---- VAR z                 -INF            9.0000        +INF         
---- VAR x1                  .             1.0000         4.0000      
---- VAR x2                  .             4.0000         4.0000      


MIP Formulation


The construct \(y=\min(x_1,x_2)\) can be interpreted as \[\begin{align} & y\le x_1 \> \mathbf{and}\> y\le x_2\\ & y\ge x_1 \>\mathbf{or}\> y\ge x_2 \end{align}\] With this, we can form the set of constraints: 


MIP big-M reformulation of min function
\[\begin{align} & \color{darkred} y \le \color{darkred}x_1 \\ & \color{darkred} y \le \color{darkred}x_2\\ & \color{darkred} y \ge \color{darkred}x_1 - \color{darkblue}M \color{darkred}\delta \\ & \color{darkred} y \ge \color{darkred}x_2 - \color{darkblue}M (1-\color{darkred}\delta)\\ & \color{darkred}\delta \in \{0,1\} \end{align}\]

The main problem here is that we need a good value for \(M\). In this case, \(M\) should be the largest difference between \(x_1\) and \(x_2\) we can observe. In our case \(x_1,x_2\in[0,4]\) so a good value would be \(M=4\). The solution can look like:

                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF            9.0000        +INF             .          
---- VAR x1                  .             1.0000         4.0000          .          
---- VAR x2                  .             4.0000         4.0000         1.0000      
---- VAR y                   .             1.0000        +INF             .          
---- VAR delta               .              .             1.0000         EPS         

It may not always be so easy to derive good values. In that case, an alternative is to use SOS1 variables (SOS1 stands for Special Ordered Set of type 1). This type of variable is supported by most MIP solvers (but not all).

MIP SOS1 reformulation of the min function
\[\begin{align} & \color{darkred} y \le \color{darkred}x_1 \\ & \color{darkred} y \le \color{darkred}x_2\\ & \color{darkred} y \ge \color{darkred}x_1 - \color{darkred}s_1 \\ & \color{darkred} y \ge \color{darkred}x_2 - \color{darkred}s_2\\ & \color{darkred}s_1,\color{darkred}s_2 \ge 0 \\ & (\color{darkred}s_1,\color{darkred}s_2) \in SOS1\end{align}\]

The SOS1 condition says: only one of the variables \((s_1,s_2)\) can be non-zero (note that they can be both zero). This means: if one of the variables is non-zero, the other one will be zero. As you can see, there are no bounds needed on \(s_i\), so this formulation is useful when we can not find a good value for \(M\) in the previous formulation.
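In gurobipy, the SOS1 version of the reformulation could look like this fragment (the weights passed to addSOS only guide branching):

# fragment: SOS1 reformulation of y = min(x1, x2)
s1 = m.addVar(name="s1")
s2 = m.addVar(name="s2")
m.addConstr(y <= x1)
m.addConstr(y <= x2)
m.addConstr(y >= x1 - s1)
m.addConstr(y >= x2 - s2)
m.addSOS(GRB.SOS_TYPE1, [s1, s2], [1, 2])   # at most one of s1, s2 is non-zero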

Finally, I want to mention that some solvers have a built-in function for \(\min()\). That makes life easier, as no reformulation is required (the solver does this behind the scenes).
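Gurobi is an example: gurobipy offers a general constraint for the min function, so the small test model can be written directly. A sketch:

import gurobipy as gp
from gurobipy import GRB

m = gp.Model("min-test")
x1 = m.addVar(ub=4, name="x1")
x2 = m.addVar(ub=4, name="x2")
y = m.addVar(name="y")
m.addGenConstrMin(y, [x1, x2])            # y = min(x1,x2); solver reformulates
m.addConstr(2 * x1 + x2 == 5 + y)
m.setObjective(x1 + 2 * x2, GRB.MAXIMIZE)
m.optimize()                              # expect z=9, x1=1, x2=4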

How not to handle the min function


I see often the following suggestion to handle the above model:

Don't do this
\[\begin{align} \max\>& \color{darkred} z= \color{darkred}x_1 + 2\color{darkred}x_2 + \color{darkblue} \alpha \cdot\color{darkred}y \\ & 2\color{darkred}x_1 + \color{darkred}x_2 = 5 + \color{darkred}y \\ & \color{darkred}y \le \color{darkred}x_1 \\ & \color{darkred}y \le \color{darkred}x_2\\ & \color{darkred}x_1,\color{darkred}x_2 \in [0,4] \end{align}\]

Here \(\alpha\gt 0\) is a constant. The idea is that the objective will now push \(y\) upwards, so we no longer need to worry about \[y \ge x_1 \> \mathbf{or} \> y \ge x_2\] The trouble is that we are changing the problem. Indeed, the results (with \(\alpha=10\)) look like:


                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF           32.5000        +INF             .          
---- VAR x1                  .             2.5000         4.0000          .          
---- VAR x2                  .             2.5000         4.0000          .          
---- VAR y                   .             2.5000        +INF             .          

This means the original objective \(z_0=x_1+2x_2\) is equal to 7.5, which is less than the objective we found earlier (\(z=9\)). So this trick is just not the right thing to do.

There are values for \(\alpha\) that actually work. Using \(\alpha=0.1\) we observe:


                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF            9.1000        +INF             .          
---- VAR x1                  .             1.0000         4.0000          .          
---- VAR x2                  .             4.0000         4.0000         0.9000      
---- VAR y                   .             1.0000        +INF             .          

This corresponds to the solution we found earlier. With some experimentation it looks like we have: \[\begin{cases} \alpha \le 1 \Rightarrow x_1=1,x_2=4\\ \alpha \gt 1 \Rightarrow x_1,x_2=2.5\end{cases}\] I am not sure if in general there is an easy way to find the threshold or even if there is always some value of \(\alpha\) that works. I don't particularly like this approach as it really changes the problem. As a result, I never use this method.

Monday, January 20, 2020

A scheduling problem


I have a job scheduling problem with a twist- a minimization constraint. The task is- I have many jobs, each with various dependencies on other jobs, without cycles. These jobs have categories as well, and can be ran together in parallel for free if they belong to the same category. So, I want to order the jobs so that each job comes after its dependencies, but arranged in such a way that they are grouped by category (to run many in parallel) to minimize the number of serial jobs I run. That is, adjacent jobs of the same category count as a single serial job [1].
It is always difficult to form a model from an informal description. For me, it often helps to develop some example data. So here we go:


----     28 SET j  jobs

job1 ,    job2 ,    job3 ,    job4 ,    job5 ,    job6 ,    job7 ,    job8 ,    job9 ,    job10,    job11,    job12
job13,    job14,    job15,    job16,    job17,    job18,    job19,    job20,    job21,    job22,    job23,    job24
job25,    job26,    job27,    job28,    job29,    job30,    job31,    job32,    job33,    job34,    job35,    job36
job37,    job38,    job39,    job40,    job41,    job42,    job43,    job44,    job45,    job46,    job47,    job48
job49,    job50


----     28 SET c  category

cat1,    cat2,    cat3,    cat4,    cat5


----     28 SET jc  job-category mapping

             cat1        cat2        cat3        cat4        cat5

job1          YES
job2                                                          YES
job3                                  YES
job4                      YES
job5                      YES
job6                      YES
job7                      YES
job8                                                          YES
job9          YES
job10                                 YES
job11                                                         YES
job12                                 YES
job13                                                         YES
job14                                             YES
job15         YES
job16                                             YES
job17         YES
job18                     YES
job19                                             YES
job20                                 YES
job21                     YES
job22                     YES
job23         YES
job24         YES
job25                                 YES
job26                                                         YES
job27                     YES
job28                                             YES
job29                                             YES
job30                     YES
job31         YES
job32                                 YES
job33         YES
job34                                                         YES
job35                     YES
job36                     YES
job37                                 YES
job38                                             YES
job39                                             YES
job40                                 YES
job41                                 YES
job42         YES
job43                     YES
job44         YES
job45                     YES
job46         YES
job47                                             YES
job48                                 YES
job49                                             YES
job50                     YES


----     28 PARAMETER length  job duration

job1  11.611,    job2  12.558,    job3  11.274,    job4   7.839,    job5   5.864,    job6   6.025,    job7  11.413
job8  10.453,    job9   5.315,    job10 12.924,    job11  5.728,    job12  6.757,    job13 10.256,    job14 12.502
job15  6.781,    job16  5.341,    job17 10.851,    job18 11.212,    job19  8.894,    job20  8.587,    job21  7.430
job22  7.464,    job23  6.305,    job24 14.334,    job25  8.799,    job26 12.834,    job27  8.000,    job28  6.255
job29 12.489,    job30  5.692,    job31  7.020,    job32  5.051,    job33  7.696,    job34  9.999,    job35  6.513
job36  6.742,    job37  8.306,    job38  8.169,    job39  8.221,    job40 14.640,    job41 14.936,    job42  8.699
job43  8.729,    job44 12.720,    job45  8.967,    job46 14.131,    job47  6.196,    job48 12.355,    job49  5.554
job50 10.763


----     28 SET before  dependencies

             job3        job9       job13       job21       job23       job27       job32       job41       job42

job1          YES
job3                      YES
job4                                  YES
job8                                                                      YES
job9                                              YES         YES
job12                                                         YES
job14                                                                                             YES
job21                                                                     YES
job26                                                                                                         YES
job31                                                                                 YES

    +       job43       job46       job48

job10                     YES         YES
job11         YES


----     28 PARAMETER due  some jobs have a due date

job16 50.756,    job19 57.757,    job20 58.797,    job25 74.443,    job29 65.605,    job32 55.928,    job50 58.012

So we have 50 jobs and 5 categories. The set \(jc_{j,c}\) indicates whether job \(j\) corresponds to category \(c\). Jobs have a duration or processing time, and some jobs have a precedence structure: the set \(\mathit{before}_{i,j}\) indicates that job \(i\) has to be executed before job \(j\). In addition, some jobs have a due date.

The way I generated the due dates and the processing times suggests a continuous time model. The main variables are: \[\begin{cases} \mathit{start}_j  \ge 0& \text{start of job $j$}\\ \mathit{end}_j \ge 0& \text{end of job $j$}\\  \delta_{i,j} \in \{0,1\} & \text{binary variable used to model no-overlap constraints}\end{cases}\]

As the objective function, I use: minimize the total makespan. The idea is that this will automatically run as many jobs of the same category in parallel as possible. We have the following constraints:

  1. A job finishes at start time plus job length.
  2. For some jobs, we have a due date. This becomes an upper bound on the end variable in the model. In practical models, you may want to allow a job to finish later than the due date, but at a cost. This would make it easier to generate meaningful results if not all due dates can be met.
  3. The precedence constraints are simple: job \(j\) can not start before job \(i\) has finished.
  4. No-overlap constraints require some thought. We use a binary variable to indicate that job \(i\) is executed before or after job \(j\). This has to hold for a subset of \((i,j)\) combinations: (a) only if \(i\lt j\): we don't want to check a pair twice, (b) only if there is no precedence constraint already in effect, and (c) only if jobs are of a different category. We can store the results of these three conditions in a set \(\mathit{NoOverlap}_{i,j}\) which indicates which elements \((i,j)\) need the no-overlap constraints. This set will be used in the model below.
With this in mind, we can formalize this as:


Mixed Integer Programming Model
\[\begin{align} \min\> & \color{darkred}{\mathit{makespan}}\\ & \color{darkred}{\mathit{makespan}} \ge \color{darkred}{\mathit{end}}_j && \forall j\\ & \color{darkred}{\mathit{end}}_j = \color{darkred}{\mathit{start}}_j + \color{darkblue}{\mathit{length}}_j&& \forall j \\ & \color{darkred}{\mathit{end}}_j \le \color{darkblue}{\mathit{due}}_j && \forall j | \color{darkblue}{\mathit{due}}_j \text{ exists}\\ & \color{darkred}{\mathit{end}}_i \le \color{darkred}{\mathit{start}}_j && \forall i,j|\color{darkblue}{\mathit{Precedence}}_{i,j} \text{ exists}\\ & \color{darkred}{\mathit{end}}_i \le \color{darkred}{\mathit{start}}_j +\color{darkblue} M \color{darkred}\delta_{i,j} && \forall i,j|\color{darkblue}{\mathit{NoOverlap}}_{i,j} \\ &\color{darkred}{\mathit{end}}_j \le \color{darkred}{\mathit{start}}_i +\color{darkblue} M (1-\color{darkred}\delta_{i,j}) && \forall i,j|\color{darkblue}{\mathit{NoOverlap}}_{i,j} \\ & \color{darkred}{\mathit{start}}_j, \color{darkred}{\mathit{end}}_j \ge 0 \\ & \color{darkred}\delta_{i,j} \in \{0,1\}\end{align} \]


Here \(M\) is a large enough constant, e.g., the planning window. There are many variations on this model (Constraint Programming formulations, alternative modeling to prevent the big-M constructs, such as SOS1 variables or indicator constraints). Still, as the first line of attack, this is not too bad. This formulation is similar to the classic job-shop formulation by Alan Manne [2]. The main issue with this model is that some thought must go into developing the set \(\mathit{NoOverlap}_{i,j}\), which indicates which combinations of jobs \((i,j)\) we need to protect against being executed at the same time.
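Computing \(\mathit{NoOverlap}_{i,j}\) is a pre-processing step. A Python sketch with hypothetical data structures (cat maps each job to its category, before is the set of direct precedence pairs; for a tighter set one could use the transitive closure of before):

def no_overlap_pairs(jobs, cat, before):
    return {(i, j)
            for a, i in enumerate(jobs)
            for j in jobs[a + 1:]             # (a) consider each pair once
            if (i, j) not in before
            and (j, i) not in before          # (b) no precedence in effect
            and cat[i] != cat[j]}             # (c) different categories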

We can substitute out \(\mathit{start}_j\) or \(\mathit{end}_j\). I would prefer to keep them both in the model to improve readability.

The model solves quickly (30 seconds on my laptop). It is not very large for modern solvers:


MODEL STATISTICS

BLOCKS OF EQUATIONS           5     SINGLE EQUATIONS        2,058
BLOCKS OF VARIABLES           4     SINGLE VARIABLES        1,073
NON ZERO ELEMENTS         6,060     DISCRETE VARIABLES        972


When we solve this, we get the following results:


----     67 VARIABLE start.L  start of job

job1  36.099,    job2  23.541,    job3  61.714,    job4   2.924,    job7  87.323,    job8  25.646,    job9  72.989
job10 47.710,    job11 23.265,    job12 47.710,    job13 23.265,    job14 10.763,    job15 36.099,    job16 10.763
job17 36.859,    job18 87.323,    job19 10.763,    job20 47.710,    job21 87.323,    job22 87.323,    job23 78.304
job24 72.989,    job25 47.710,    job26 23.265,    job27 94.753,    job28 10.763,    job29 10.776,    job31 36.099
job32 47.710,    job33 36.099,    job34 23.265,    job37 47.710,    job38 10.763,    job39 10.763,    job40 47.710
job41 47.710,    job42 36.099,    job43 87.323,    job44 72.989,    job46 73.192,    job47 10.763,    job48 60.634
job49 10.763


----     67 VARIABLE end.L  end of job

job1   47.710,    job2   36.099,    job3   72.989,    job4   10.763,    job5    5.864,    job6    6.025
job7   98.736,    job8   36.099,    job9   78.304,    job10  60.634,    job11  28.993,    job12  54.467
job13  33.521,    job14  23.265,    job15  42.880,    job16  16.104,    job17  47.710,    job18  98.535
job19  19.657,    job20  56.297,    job21  94.753,    job22  94.787,    job23  84.609,    job24  87.323
job25  56.510,    job26  36.099,    job27 102.754,    job28  17.018,    job29  23.265,    job30   5.692
job31  43.119,    job32  52.761,    job33  43.795,    job34  33.264,    job35   6.513,    job36   6.742
job37  56.017,    job38  18.932,    job39  18.984,    job40  62.350,    job41  62.646,    job42  44.798
job43  96.052,    job44  85.708,    job45   8.967,    job46  87.323,    job47  16.959,    job48  72.989
job49  16.317,    job50  10.763

(Some jobs have a start time of zero. They are not printed here.)

For scheduling models, this type of output is difficult to interpret by humans. Better is something like a Gantt chart:

Results for example data set


Indeed, we see that many jobs of the same category are executed in parallel. Categories 1 and 2 each appear in two separate periods; this is due to the due dates and precedence constraints. (If we did not have due dates or precedence constraints, we would see just a single region for each category.) Also, note that some jobs have a little bit of freedom to float within a window. For instance, job 4 could have started a bit earlier. We can fix this in post-processing, or tell the model to move jobs to the left when possible (this would mean an additional term in the objective).

Note that commercial solvers have no problems with this model and data set.  For instance Cplex shows:

Root node processing (before b&c):
  Real time             =    3.16 sec. (583.75 ticks)
Parallel b&c, 8 threads:
  Real time             =   33.03 sec. (16153.92 ticks)
  Sync time (average)   =    4.97 sec.
  Wait time (average)   =    0.01 sec.
                          ------------
Total (root+branch&cut) =   36.19 sec. (16737.66 ticks)
MIP status(101): integer optimal solution
Cplex Time: 36.20sec (det. 16737.67 ticks)

However, open-source and academic codes have more trouble. E.g., SCIP on NEOS shows:

SCIP Status        : problem is solved [optimal solution found]
Solving Time (sec) : 13010.25
Solving Nodes      : 5602808
Primal Bound       : +1.02753723219890e+02 (78 solutions)
Dual Bound         : +1.02753723219890e+02
Gap                : 0.00 %

This is quite a dramatic difference.

References


  1. Job scheduling with minimization by parallel grouping, https://stackoverflow.com/questions/59722383/job-scheduling-with-minimization-by-parallel-grouping
  2. Alan Manne, On the Job-Shop Scheduling Problem, Operations Research, Volume 8, Issue 2, 1960, Pages 219-223.