Wednesday, January 29, 2020

Multi-objective optimization: min sum vs min max


The issue

In many models, we observe that an objective that minimizes a sum delivers undesirable results. Here are some examples:

Assignment under preferences

In [1], we formulated a MIP model that tackles the problem of creating groups while taking into account the preferences of people. When we just maximize the total sum of preferences that are honored by the assignment, we can see that some participants get a terrible assignment: they are, in effect, taking one for the team. It requires some thought to develop a model that produces a more equitable solution. In a sense, we are trading off overall efficiency for fairness.

Job scheduling

Typically in job scheduling models, we minimize (1) the total make-span (do things as quickly as possible) and (2) the sum of the tardiness of the jobs. Sometimes we use a weighting scheme for tardy jobs: jobs from valuable clients may get a higher weight to make it less likely they are chosen to be tardy. There are additional terms we can add to the objective, for instance:

  • The number of jobs that are tardy. Minimizing the sum of tardiness may distribute the pain over many jobs. If we want to reduce the number of tardy jobs, we can add a term that counts them.
  • The maximum tardiness of a job. Adding such a term will try to prevent the situation where a single job will take the brunt of the pain.

These are examples where just minimizing the sum can lead to unpalatable solutions. It may require some effort to design a more sophisticated objective that takes into account more aspects than just overall efficiency.

Example model


In [2] I discussed a MIP formulation for a matching problem:

Given \(N=2M\) points, find a matching that minimizes the total cost (represented by the lengths of the segments between the matched points). 

A. Single solution


In this section, I discuss how we can look at the problem in different ways:

  1. MIN SUM: Solve the problem minimizing the sum of the cost.
  2. MIN MAX: Solve the problem minimizing the maximum cost. 
  3. MIN MAX followed by MIN SUM: this variant exploits the fact that the MIN MAX solution is not unique. 
  4. MIN SUM followed by MIN MAX: because the MIN SUM solution is unique, this is not an interesting problem.
  5. A weighted sum of MIN SUM and MIN MAX.

A.1. MIN SUM Matching 


The Min Sum problem can be stated as a MIP problem:

MIP Model for Min Sum Matching
\[\begin{align}\min& \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]
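
As an illustration, here is a minimal PuLP sketch of this model. The random points below are stand-ins for the actual data set; the listings in this post come from my original implementation, not from this sketch.

import itertools, math, random
import pulp as lp

random.seed(1)
N = 50    # number of points (must be even)
pts = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(N)]
# distances: store only the upper-triangular part (i < j)
d = {(i, j): math.dist(pts[i], pts[j])
     for i, j in itertools.combinations(range(N), 2)}

m = lp.LpProblem("minsum_matching", lp.LpMinimize)
x = {(i, j): lp.LpVariable(f"x_{i}_{j}", cat="Binary") for (i, j) in d}
m += lp.lpSum(d[k] * x[k] for k in d)               # total length
for i in range(N):                                  # each point in one pair
    m += (lp.lpSum(x[i, j] for j in range(i + 1, N))
          + lp.lpSum(x[j, i] for j in range(i))) == 1
m.solve()
print("min sum:", lp.value(m.objective))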

The optimal MIP solution looks like:


Optimal MIN SUM solution


In the solution, we see a spike: the longest length is much larger than the rest.

A.2. MIN MAX Problem


Instead of minimizing the sum, we can also minimize the length of the largest segment. Formally we want \[\min \max_{i\lt j} d_{i,j} x_{i,j}\] This can be linearized in a standard manner.

MIP Model for Min-Max Matching
\[\begin{align}\min\>& \color{darkred}z\\ & \color{darkred}z \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall  i \lt j\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]
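
Continuing the PuLP sketch above, the MIN MAX variant only needs the extra bounding variable \(z\):

m2 = lp.LpProblem("minmax_matching", lp.LpMinimize)
x2 = {(i, j): lp.LpVariable(f"xm_{i}_{j}", cat="Binary") for (i, j) in d}
z = lp.LpVariable("z", lowBound=0)                  # max selected length
m2 += z                                             # objective: min max
for k in d:
    m2 += z >= d[k] * x2[k]                         # z bounds every length
for i in range(N):
    m2 += (lp.lpSum(x2[i, j] for j in range(i + 1, N))
           + lp.lpSum(x2[j, i] for j in range(i))) == 1
m2.solve()
print("min max:", lp.value(z))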

The results look like:

Optimal MIN MAX solution


We see that the maximum length is much shorter than the maximum length in the MIN SUM solution.

One issue with the MIN MAX problem is that the solution is not unique. Any configuration in which all lengths stay within the optimal maximum length is optimal.

A.3. MIN MAX followed by MIN SUM 


Instead of just using the MIN MAX objective, we can look at the problem as a multi-objective problem:

Multi-objective Model
\[\begin{align}\min \>& \color{darkred}z_1 = \max_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ \min\>& \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

Of course we need to reformulate the \(\max\) function to make it linear. The multi-objective structure was implemented as:


  1. Solve the problem with the MIN MAX objective\[\begin{align} \min\>&z_1\\ &z_1 \ge d_{i,j} x_{i,j} &&\forall i\lt j \\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 && \forall i \\ &x_{i,j} \in \{0,1\} \end{align}\] Let \(z_1^*\) be the optimal objective value.
  2. Solve the problem with the MIN SUM objective, subject to the MIN MAX objective not deteriorating (i.e., the length of each segment must not exceed the maximum length found in the previous step). \[\begin{align}\min\>& z_2 = \sum_{i,j|i \lt j} d_{i,j} x_{i,j} \\ & d_{i,j} x_{i,j} \le z_1^* && \forall i\lt j\\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 && \forall i\\ & x_{i,j} \in \{0,1\}\end{align}\] 

This is typically called the lexicographic method or preemptive method.
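
In terms of the PuLP sketches above, the two solves can look like this (\(z\) from the MIN MAX sketch supplies \(z_1^*\)):

z1_star = lp.value(z)                               # optimal MIN MAX value

m3 = lp.LpProblem("lex_minmax_then_minsum", lp.LpMinimize)
x3 = {(i, j): lp.LpVariable(f"xl_{i}_{j}", cat="Binary") for (i, j) in d}
m3 += lp.lpSum(d[k] * x3[k] for k in d)             # phase 2: MIN SUM
for k in d:
    m3 += d[k] * x3[k] <= z1_star                   # keep phase-1 optimum
for i in range(N):
    m3 += (lp.lpSum(x3[i, j] for j in range(i + 1, N))
           + lp.lpSum(x3[j, i] for j in range(i))) == 1
m3.solve()
print("lexicographic sum:", lp.value(m3.objective))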

The results look like:

Optimal multi-objective solution


We see a much better distribution of the lengths with this approach. The underlying reason is that the MIN MAX solution is not unique. There are a ton of solutions with all lengths not exceeding the MIN MAX objective (at least 50k: I stopped counting after reaching this number). With the second objective, we choose the "best" of all these solutions.

Here is a summary of the results:

Model                            Objective        Sum      Max
MIN SUM                            241.267    241.267   38.768
MIN MAX                             22.954    403.265   22.954
MULTI-OBJECTIVE (second solve)     247.382    247.382   22.954

We can see that:

  • The largest segment is very long for the MIN SUM model
  • The sum is very large for the MIN MAX model.
  • The MULTI model has a sum only a bit larger than that of the MIN SUM model, while the largest length is identical to that of the MIN MAX model. It avoids the disadvantages of both the MIN SUM and the MIN MAX models.

A.4. MIN SUM followed by MIN MAX


The problem formed by first solving the MIN SUM problem followed by the MIN MAX model is not of much value: we get the same solution as the MIN SUM problem. The reason is that the MIN SUM solution is unique, so running the second model does not yield a different solution.

A.5. A weighted sum of MIN SUM and MIN MAX


An alternative to the preemptive or lexicographic method is to use a composite objective with some weights:

Multi-objective Model via Weighted Sum
\[\begin{align}\min \>&  \color{darkblue}w_1 \color{darkred}z_1 +  \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge   \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i,j| i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The convention is we use \(w_i\gt 0\) and \(\sum_i w_i=1\). If we use\[\begin{cases} w_1 = 0.01 \\ w_2 = 0.99\end{cases}\] we get the same solution as in section A.3 (MIN MAX followed by MIN SUM).

B. Finding All Efficient Solutions


In section A we looked at characterizing the "optimal" solution by a single point. When we have multiple objectives (min-sum and min-max), it can be insightful to look at all "optimal" points, or a representative subset. An optimal solution in this context can be defined as one that is not dominated by any other solution. Such a solution is called efficient. The collection of all efficient solutions is called the efficient frontier. 

B.1. Weighted sum loop


A first method is to vary the weights and solve for each weight. We don't want weights of zero (or one), so I generated the following step values:


----    135 PARAMETER step  

step0  0.010,    step1  0.050,    step2  0.100,    step3  0.150,    step4  0.200,    step5  0.250,    step6  0.300
step7  0.350,    step8  0.400,    step9  0.450,    step10 0.500,    step11 0.550,    step12 0.600,    step13 0.650
step14 0.700,    step15 0.750,    step16 0.800,    step17 0.850,    step18 0.900,    step19 0.950,    step20 0.990
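
A sketch of this loop, continuing the earlier PuLP fragments (note that, judging from the report below, \(w_1\) weights the sum and \(w_2\) the max):

def solve_weighted(w_sum, w_max):
    """Solve the weighted-sum model; return (sum of lengths, max length)."""
    m = lp.LpProblem("weighted", lp.LpMinimize)
    x = {(i, j): lp.LpVariable(f"xw_{i}_{j}", cat="Binary") for (i, j) in d}
    z1 = lp.LpVariable("z1", lowBound=0)            # max length
    z2 = lp.lpSum(d[k] * x[k] for k in d)           # sum of lengths
    m += w_sum * z2 + w_max * z1                    # weighted objective
    for k in d:
        m += z1 >= d[k] * x[k]
    for i in range(N):
        m += (lp.lpSum(x[i, j] for j in range(i + 1, N))
              + lp.lpSum(x[j, i] for j in range(i))) == 1
    m.solve()
    return lp.value(z2), lp.value(z1)

steps = [0.01] + [round(0.05 * k, 2) for k in range(1, 20)] + [0.99]
front = [(w1, 1 - w1, *solve_weighted(w1, 1 - w1)) for w1 in steps]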

If we solve our weighted sum model for these values, we get the following report:


----    147 PARAMETER front  

                w1          w2         sum         max         obj

step0        0.010       0.990     247.382      22.954      25.199
step1        0.050       0.950     247.382      22.954      34.176
step2        0.100       0.900     247.382      22.954      45.397
step3        0.150       0.850     247.382      22.954      56.619
step4        0.200       0.800     247.382      22.954      67.840
step5        0.250       0.750     247.382      22.954      79.061
step6        0.300       0.700     247.382      22.954      90.283
step7        0.350       0.650     247.382      22.954     101.504
step8        0.400       0.600     247.382      22.954     112.726
step9        0.450       0.550     247.382      22.954     123.947
step10       0.500       0.500     247.382      22.954     135.168
step11       0.550       0.450     247.382      22.954     146.390
step12       0.600       0.400     247.382      22.954     157.611
step13       0.650       0.350     247.382      22.954     168.833
step14       0.700       0.300     247.382      22.954     180.054
step15       0.750       0.250     241.267      38.768     190.642
step16       0.800       0.200     241.267      38.768     200.767
step17       0.850       0.150     241.267      38.768     210.892
step18       0.900       0.100     241.267      38.768     221.017
step19       0.950       0.050     241.267      38.768     231.142
step20       0.990       0.010     241.267      38.768     239.242

This is a bit disappointing: we only observe two different solutions. And we saw these before.

We may miss some efficient (i.e., non-dominated) solutions. In general, the weighted sum method suffers from two problems:

  1. We may miss some efficient solutions. Although each point found with the weighted sum method is non-dominated, the other way around is not true: there may be efficient solutions we cannot see with this method.
  2. If this method finds several efficient points, these points may not be nicely distributed.

Note that in our simple loop we can miss efficient solutions for two reasons:

  • The inherent problem of the weighted sum model: it can miss solutions.
  • The way we set up the loop: we only looked at 21 different weights. When we increase this number, we may see more non-dominated solutions.

There is another method we can use.

B.2. Find non-dominated solutions


We can actively look for non-dominated solutions as follows:

  1. Start with arbitrary weights \(w_1, w_2 \in (0,1)\) to find a first solution on the efficient frontier. E.g. \(w_1 = w_2 = 0.5.\)
  2. Add to the model constraints on the objective values \(z_1, z_2\) such that a new solution is better in one of the objectives than any previously found efficient solution.
  3. Also, add a cut to the model that forbids the current solution.
  4. Solve. If infeasible, STOP. We are done.
  5. Go to step 2.

The model solved at each step \(K\) looks like:


MIP Model in search for non-dominated solutions
\[\begin{align}\min \>& \color{darkblue}w_1 \color{darkred}z_1 + \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i \\ & \color{darkred}z_1 \le \color{darkblue}z_1^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_1 \color{darkred}\delta && k=1,\dots,K-1\\ & \color{darkred}z_2 \le \color{darkblue}z_2^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_2 (1-\color{darkred}\delta) && k=1,\dots,K-1 \\ & \sum_{i,j|i\lt j} (2 \color{darkblue}x^{(k)}_{i,j}-1) \color{darkred}x_{i,j} \le \sum_{i,j|i\lt j} \color{darkblue}x^{(k)}_{i,j}-1 && k=1,\dots,K-1 \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}\delta \in \{0,1\} \end{align} \]

Here \(z^{(k)}\) and \(x^{(k)}\) indicate solutions found in earlier iterations. \(\epsilon\) is a small constant to make sure we are actually improving an objective. Good values for \(M\) can be found using our earlier optimization models. E.g.: \[\begin{align} &M_1 = 38.768 - 22.954 + \epsilon \\
& M_2 = 247.382 - 241.267 + \epsilon\end{align}\] The constraints \[\begin{align} &z_1 \le  z_1^{(k)} - \epsilon + M_1 \delta\\ &z_2 \le z_2^{(k)} - \epsilon + M_2 (1-\delta)\end{align}\] say: one of the objectives should be (strictly) better compared to any other point we have found earlier. Finally, the constraint: \[\sum_{i,j|i\lt j} (2 x^{(k)}_{i,j}-1) x_{i,j} \le \sum_{i,j|i\lt j} x^{(k)}_{i,j}-1\] forbids any previously found point to be considered again.
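
A sketch of the complete loop in PuLP, continuing the earlier fragments (as in the model above, a single \(\delta\) is used per solve):

eps = 0.01                           # minimal required improvement
M1 = 38.768 - 22.954 + eps           # range of the max objective
M2 = 247.382 - 241.267 + eps         # range of the sum objective
found = []                           # (z1, z2, x-values) per efficient point

while True:
    m = lp.LpProblem("nondominated", lp.LpMinimize)
    x = {(i, j): lp.LpVariable(f"xn_{i}_{j}", cat="Binary") for (i, j) in d}
    z1 = lp.LpVariable("z1", lowBound=0)
    z2 = lp.lpSum(d[k] * x[k] for k in d)
    delta = lp.LpVariable("delta", cat="Binary")
    m += 0.5 * z1 + 0.5 * z2                        # w1 = w2 = 0.5
    for k in d:
        m += z1 >= d[k] * x[k]
    for i in range(N):
        m += (lp.lpSum(x[i, j] for j in range(i + 1, N))
              + lp.lpSum(x[j, i] for j in range(i))) == 1
    for z1k, z2k, xk in found:
        m += z1 <= z1k - eps + M1 * delta           # improve the max, or...
        m += z2 <= z2k - eps + M2 * (1 - delta)     # ...improve the sum
        m += (lp.lpSum((2 * xk[k] - 1) * x[k] for k in d)
              <= sum(xk.values()) - 1)              # forbid this solution
    if m.solve() != 1:                              # 1 = optimal in PuLP
        break                                       # infeasible: we are done
    found.append((lp.value(z1), lp.value(z2),
                  {k: round(x[k].value()) for k in d}))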

When we run this algorithm, we find the following solutions:


----    228 PARAMETER front2  

               sum         max         obj

point1   247.38237    22.95441   135.16839
point2   243.76532    33.90486   138.83509
point3   241.26672    38.76785   140.01729

We see that point2 is a newly found efficient solution. In addition, we only need to solve 4 models (the last model, not shown in the results, is infeasible). This model is substantially more complicated than the previous loop with different weights. But we get something in return for this additional complexity: this algorithm produces all efficient points and, depending on the number of non-dominated solutions, may need to solve fewer models.

Conclusion


In practical models, what constitutes an optimal solution is not always easy to define. An overall best solution may hurt some individuals disproportionately. In our simple example, we saw that the MIN SUM solution contained a few very bad matches. Trying to fix this with a multi-objective approach requires some thought. And this may give rise to new questions, such as: give me all (or some subset) of the non-dominated solutions. This, again, is not an easy question.

References


  1. Preferences in assignments, the post of Thursday, January 9, 2020 (below).
  2. MIP vs greedy search, the post of Saturday, January 11, 2020 (below).
Tuesday, January 28, 2020

How to model y=min(x1,x2)


Introduction


The problem of modeling \(y=\min(x_1,x_2)\) in optimization models is not completely trivial. Here are some notes.

Small example model


To experiment, here is a small test model:

MINLP Model
\[\begin{align} \max\>& \color{darkred} z= \color{darkred}x_1 + 2\color{darkred}x_2\\ & 2\color{darkred}x_1 + \color{darkred}x_2 = 5 + \min(\color{darkred}x_1,\color{darkred}x_2)\\ & \color{darkred}x_1,\color{darkred}x_2 \in [0,4] \end{align}\]


Global MINLP solvers


Interestingly the global MINLP solvers Baron, Couenne, and Antigone (under GAMS) have no direct facilities for this function:


Solver     Error message
Baron      *** Cannot handle function 'min'
Couenne    Gams function min not supported.
Antigone   ERROR: Unsupported function min in equation e0


This is unfortunate. I think it would be better if this function were supported directly. Of course, we are not defeated that easily. As the solvers do support the \(\mathrm{abs}()\) function, let's see if we can use that instead. A possible reformulation is to replace \(\min(x_1,x_2)\) by \[\min(x_1,x_2)=\frac{x_1+x_2-|x_1-x_2|}{2}\] The derivation is as follows: \[\frac{x_1+x_2-|x_1-x_2|}{2} = \begin{cases}\displaystyle\frac{x_1+x_2-x_1+x_2}{2}=x_2 & \text{if $x_1\ge x_2$}\\ \displaystyle\frac{x_1+x_2+x_1-x_2}{2}=x_1 & \text{if $x_1\lt x_2$}\end{cases}\] Using this we see:

                           LOWER          LEVEL          UPPER

---- VAR z                 -INF            9.0000        +INF         
---- VAR x1                  .             1.0000         4.0000      
---- VAR x2                  .             4.0000         4.0000      
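
The identity is easy to verify numerically; a quick standalone check (not part of the model):

# sanity check: min(x1,x2) == (x1 + x2 - |x1 - x2|)/2
import random
for _ in range(100_000):
    x1, x2 = random.uniform(0, 4), random.uniform(0, 4)
    assert abs(min(x1, x2) - (x1 + x2 - abs(x1 - x2)) / 2) < 1e-12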


MIP Formulation


The construct \(y=\min(x_1,x_2)\) can be interpreted as \[\begin{align} & y\le x_1 \> \mathbf{and}\> y\le x_2\\ & y\ge x_1 \>\mathbf{or}\> y\ge x_2 \end{align}\] With this, we can form the set of constraints: 


MIP big-M reformulation of min function
\[\begin{align} & \color{darkred} y \le \color{darkred}x_1 \\ & \color{darkred} y \le \color{darkred}x_2\\ & \color{darkred} y \ge \color{darkred}x_1 - \color{darkblue}M \color{darkred}\delta \\ & \color{darkred} y \ge \color{darkred}x_2 - \color{darkblue}M (1-\color{darkred}\delta)\\ & \color{darkred}\delta \in \{0,1\} \end{align}\]

The main problem here is that we need a good value for \(M\). In this case, \(M\) should be the largest difference between \(x_1\) and \(x_2\) we can observe. In our case \(x_1,x_2\in[0,4]\) so a good value would be \(M=4\). The solution can look like:

                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF            9.0000        +INF             .          
---- VAR x1                  .             1.0000         4.0000          .          
---- VAR x2                  .             4.0000         4.0000         1.0000      
---- VAR y                   .             1.0000        +INF             .          
---- VAR delta               .              .             1.0000         EPS         
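
For reference, a minimal PuLP rendition of the example model with this big-M construct (the listing above shows the same solution):

import pulp as lp

m = lp.LpProblem("min_example", lp.LpMaximize)
x1 = lp.LpVariable("x1", 0, 4)
x2 = lp.LpVariable("x2", 0, 4)
y = lp.LpVariable("y")                    # models y = min(x1, x2)
delta = lp.LpVariable("delta", cat="Binary")
M = 4                                     # largest possible |x1 - x2|

m += x1 + 2 * x2                          # objective
m += 2 * x1 + x2 == 5 + y
m += y <= x1
m += y <= x2
m += y >= x1 - M * delta
m += y >= x2 - M * (1 - delta)
m.solve()
print(lp.value(m.objective), x1.value(), x2.value(), y.value())  # 9 1 4 1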

It may not always be so easy to derive good values for \(M\). In that case, an alternative is to use SOS1 variables (SOS1 stands for Special Ordered Set of type 1). This type of variable is supported by most MIP solvers (but not all).

MIP SOS1 reformulation of the min function
\[\begin{align} & \color{darkred} y \le \color{darkred}x_1 \\ & \color{darkred} y \le \color{darkred}x_2\\ & \color{darkred} y \ge \color{darkred}x_1 - \color{darkred}s_1 \\ & \color{darkred} y \ge \color{darkred}x_2 - \color{darkred}s_2\\ & \color{darkred}s_1,\color{darkred}s_2 \ge 0 \\ & (\color{darkred}s_1,\color{darkred}s_2) \in SOS1\end{align}\]

The SOS1 condition says: only one of the variables \((s_1,s_2)\) can be non-zero (note that they can both be zero). This means: if one of the variables is non-zero, the other one will be zero. As you can see, no bounds are needed on \(s_i\), so this formulation is useful when we cannot find a good value for \(M\) in the previous formulation.

I want to mention that some solvers have a built-in function for \(\min()\). That makes life easier, as there is no reformulation required (the solver will do this behind the scenes).

How not to handle the min function


I often see the following suggestion for handling the above model:

Don't do this
\[\begin{align} \max\>& \color{darkred} z= \color{darkred}x_1 + 2\color{darkred}x_2 + \color{darkblue} \alpha \cdot\color{darkred}y \\ & 2\color{darkred}x_1 + \color{darkred}x_2 = 5 + \color{darkred}y \\ & \color{darkred}y \le \color{darkred}x_1 \\ & \color{darkred}y \le \color{darkred}x_2\\ & \color{darkred}x_1,\color{darkred}x_2 \in [0,4] \end{align}\]

Here \(\alpha\gt 0\) is a constant. The idea is that the objective will now push \(y\) upwards, so we don't need to worry anymore about \[y \ge x_1 \> \mathbf{or} \> y \ge x_2\] The trouble is that we are changing the problem. Indeed, the results (with \(\alpha=10\)) look like:


                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF           32.5000        +INF             .          
---- VAR x1                  .             2.5000         4.0000          .          
---- VAR x2                  .             2.5000         4.0000          .          
---- VAR y                   .             2.5000        +INF             .          

This means the objective \(z_0=x_1+2x_2\) is equal to 7.5 which is less than the objective we saw earlier (we had before \(z=9\)). So this trick is just not the right thing to do.

There are values for \(\alpha\) that actually work. Using \(\alpha=0.1\) we observe:


                           LOWER          LEVEL          UPPER         MARGINAL

---- VAR z                 -INF            9.1000        +INF             .          
---- VAR x1                  .             1.0000         4.0000          .          
---- VAR x2                  .             4.0000         4.0000         0.9000      
---- VAR y                   .             1.0000        +INF             .          

This corresponds to the solution we found earlier. With some experimentation it looks like we have: \[\begin{cases} \alpha \le 1 \Rightarrow x_1=1,x_2=4\\ \alpha \gt 1 \Rightarrow x_1,x_2=2.5\end{cases}\] I am not sure if in general there is an easy way to find the threshold or even if there is always some value of \(\alpha\) that works. I don't particularly like this approach as it really changes the problem. As a result, I never use this method.

Monday, January 20, 2020

A scheduling problem


I have a job scheduling problem with a twist- a minimization constraint. The task is- I have many jobs, each with various dependencies on other jobs, without cycles. These jobs have categories as well, and can be ran together in parallel for free if they belong to the same category. So, I want to order the jobs so that each job comes after its dependencies, but arranged in such a way that they are grouped by category (to run many in parallel) to minimize the number of serial jobs I run. That is, adjacent jobs of the same category count as a single serial job [1].

It is always difficult to form a model from an informal description. For me, it often helps to develop some example data. So here we go:


----     28 SET j  jobs

job1 ,    job2 ,    job3 ,    job4 ,    job5 ,    job6 ,    job7 ,    job8 ,    job9 ,    job10,    job11,    job12
job13,    job14,    job15,    job16,    job17,    job18,    job19,    job20,    job21,    job22,    job23,    job24
job25,    job26,    job27,    job28,    job29,    job30,    job31,    job32,    job33,    job34,    job35,    job36
job37,    job38,    job39,    job40,    job41,    job42,    job43,    job44,    job45,    job46,    job47,    job48
job49,    job50


----     28 SET c  category

cat1,    cat2,    cat3,    cat4,    cat5


----     28 SET jc  job-category mapping

             cat1        cat2        cat3        cat4        cat5

job1          YES
job2                                                          YES
job3                                  YES
job4                      YES
job5                      YES
job6                      YES
job7                      YES
job8                                                          YES
job9          YES
job10                                 YES
job11                                                         YES
job12                                 YES
job13                                                         YES
job14                                             YES
job15         YES
job16                                             YES
job17         YES
job18                     YES
job19                                             YES
job20                                 YES
job21                     YES
job22                     YES
job23         YES
job24         YES
job25                                 YES
job26                                                         YES
job27                     YES
job28                                             YES
job29                                             YES
job30                     YES
job31         YES
job32                                 YES
job33         YES
job34                                                         YES
job35                     YES
job36                     YES
job37                                 YES
job38                                             YES
job39                                             YES
job40                                 YES
job41                                 YES
job42         YES
job43                     YES
job44         YES
job45                     YES
job46         YES
job47                                             YES
job48                                 YES
job49                                             YES
job50                     YES


----     28 PARAMETER length  job duration

job1  11.611,    job2  12.558,    job3  11.274,    job4   7.839,    job5   5.864,    job6   6.025,    job7  11.413
job8  10.453,    job9   5.315,    job10 12.924,    job11  5.728,    job12  6.757,    job13 10.256,    job14 12.502
job15  6.781,    job16  5.341,    job17 10.851,    job18 11.212,    job19  8.894,    job20  8.587,    job21  7.430
job22  7.464,    job23  6.305,    job24 14.334,    job25  8.799,    job26 12.834,    job27  8.000,    job28  6.255
job29 12.489,    job30  5.692,    job31  7.020,    job32  5.051,    job33  7.696,    job34  9.999,    job35  6.513
job36  6.742,    job37  8.306,    job38  8.169,    job39  8.221,    job40 14.640,    job41 14.936,    job42  8.699
job43  8.729,    job44 12.720,    job45  8.967,    job46 14.131,    job47  6.196,    job48 12.355,    job49  5.554
job50 10.763


----     28 SET before  dependencies

             job3        job9       job13       job21       job23       job27       job32       job41       job42

job1          YES
job3                      YES
job4                                  YES
job8                                                                      YES
job9                                              YES         YES
job12                                                         YES
job14                                                                                             YES
job21                                                                     YES
job26                                                                                                         YES
job31                                                                                 YES

    +       job43       job46       job48

job10                     YES         YES
job11         YES


----     28 PARAMETER due  some jobs have a due date

job16 50.756,    job19 57.757,    job20 58.797,    job25 74.443,    job29 65.605,    job32 55.928,    job50 58.012

So we have 50 jobs and 5 categories. The set \(jc_{j,c}\) indicates if job \(j\) corresponds to category \(c\). Jobs have a duration or processing time, and some jobs have a precedence structure: the set \(\mathit{before}_{i,j}\) indicates that job \(i\) has to be executed before job \(j\). In addition, some jobs have a due date.

The way I generated the due dates and the processing times suggests a continuous time model. The main variables are: \[\begin{cases} \mathit{start}_j  \ge 0& \text{start of job $j$}\\ \mathit{end}_j \ge 0& \text{end of job $j$}\\  \delta_{i,j} \in \{0,1\} & \text{binary variable used to model no-overlap constraints}\end{cases}\]

As the objective function, I use: minimize the total makespan. The idea is that this will automatically try to run jobs of the same category in parallel as much as possible. We have the following constraints:

  1. A job finishes at start time plus job length.
  2. For some jobs, we have a due date. This becomes an upper bound on the end variable in the model. In practical models, you may want to allow a job to finish later than its due date, at a cost. This would make it easier to generate meaningful results when not all due dates can be met.
  3. The precedence constraints are simple: job \(j\) can not start before job \(i\) has finished.
  4. No-overlap constraints require some thought. We use a binary variable to indicate that job \(i\) is executed before or after job \(j\). This has to hold for a subset of \((i,j)\) combinations: (a) only if \(i\lt j\): we don't want to check a pair twice, (b) only if there is no precedence constraint already in effect, and (c) only if jobs are of a different category. We can store the results of these three conditions in a set \(\mathit{NoOverlap}_{i,j}\) which indicates which elements \((i,j)\) need the no-overlap constraints. This set will be used in the model below.

With this in mind, we can formalize this as:


Mixed Integer Programming Model
\[\begin{align} \min\> & \color{darkred}{\mathit{makespan}}\\ & \color{darkred}{\mathit{makespan}} \ge \color{darkred}{\mathit{end}}_j && \forall j\\ & \color{darkred}{\mathit{end}}_j = \color{darkred}{\mathit{start}}_j + \color{darkblue}{\mathit{length}}_j&& \forall j \\ & \color{darkred}{\mathit{end}}_j \le \color{darkblue}{\mathit{due}}_j && \forall j | \color{darkblue}{\mathit{due}}_j \text{ exists}\\ & \color{darkred}{\mathit{end}}_i \le \color{darkred}{\mathit{start}}_j && \forall i,j|\color{darkblue}{\mathit{Precedence}}_{i,j} \text{ exists}\\ & \color{darkred}{\mathit{end}}_i \le \color{darkred}{\mathit{start}}_j +\color{darkblue} M \color{darkred}\delta_{i,j} && \forall i,j|\color{darkblue}{\mathit{NoOverlap}}_{i,j} \\ &\color{darkred}{\mathit{end}}_j \le \color{darkred}{\mathit{start}}_i +\color{darkblue} M (1-\color{darkred}\delta_{i,j}) && \forall i,j|\color{darkblue}{\mathit{NoOverlap}}_{i,j} \\ & \color{darkred}{\mathit{start}}_j, \color{darkred}{\mathit{end}}_j \ge 0 \\ & \color{darkred}\delta_{i,j} \in \{0,1\}\end{align} \]


Here \(M\) is a large enough constant, e.g., the length of the planning window. There are many variations on this model (Constraint Programming formulations, alternative modeling to prevent the big-M constructs, such as SOS1 variables or indicator constraints). Still, as a first line of attack, this is not too bad. This formulation is similar to the classic job-shop formulation by Alan Manne [2]. The part of this model that required the most thought is the set \(\mathit{NoOverlap}_{i,j}\), which indicates which combinations of jobs \((i,j)\) we need to protect against being executed at the same time.
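
Deriving this set is straightforward; here is a small sketch with tiny illustrative data (cat maps jobs to categories, prec holds the direct precedence pairs):

jobs = ["job1", "job2", "job3", "job4"]
cat = {"job1": "cat1", "job2": "cat5", "job3": "cat3", "job4": "cat2"}
prec = {("job1", "job3")}                 # job1 must finish before job3

no_overlap = {(jobs[a], jobs[b])
              for a in range(len(jobs))
              for b in range(a + 1, len(jobs))       # each pair once
              if (jobs[a], jobs[b]) not in prec      # no precedence...
              and (jobs[b], jobs[a]) not in prec     # ...either way
              and cat[jobs[a]] != cat[jobs[b]]}      # different category
print(sorted(no_overlap))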

We can substitute out \(\mathit{start}_j\) or \(\mathit{end}_j\). I would prefer to keep them both in the model to improve readability.

The model solves quickly (30 seconds on my laptop). It is not very large for modern solvers:


MODEL STATISTICS

BLOCKS OF EQUATIONS           5     SINGLE EQUATIONS        2,058
BLOCKS OF VARIABLES           4     SINGLE VARIABLES        1,073
NON ZERO ELEMENTS         6,060     DISCRETE VARIABLES        972


When we solve this, we get the following results:


----     67 VARIABLE start.L  start of job

job1  36.099,    job2  23.541,    job3  61.714,    job4   2.924,    job7  87.323,    job8  25.646,    job9  72.989
job10 47.710,    job11 23.265,    job12 47.710,    job13 23.265,    job14 10.763,    job15 36.099,    job16 10.763
job17 36.859,    job18 87.323,    job19 10.763,    job20 47.710,    job21 87.323,    job22 87.323,    job23 78.304
job24 72.989,    job25 47.710,    job26 23.265,    job27 94.753,    job28 10.763,    job29 10.776,    job31 36.099
job32 47.710,    job33 36.099,    job34 23.265,    job37 47.710,    job38 10.763,    job39 10.763,    job40 47.710
job41 47.710,    job42 36.099,    job43 87.323,    job44 72.989,    job46 73.192,    job47 10.763,    job48 60.634
job49 10.763


----     67 VARIABLE end.L  end of job

job1   47.710,    job2   36.099,    job3   72.989,    job4   10.763,    job5    5.864,    job6    6.025
job7   98.736,    job8   36.099,    job9   78.304,    job10  60.634,    job11  28.993,    job12  54.467
job13  33.521,    job14  23.265,    job15  42.880,    job16  16.104,    job17  47.710,    job18  98.535
job19  19.657,    job20  56.297,    job21  94.753,    job22  94.787,    job23  84.609,    job24  87.323
job25  56.510,    job26  36.099,    job27 102.754,    job28  17.018,    job29  23.265,    job30   5.692
job31  43.119,    job32  52.761,    job33  43.795,    job34  33.264,    job35   6.513,    job36   6.742
job37  56.017,    job38  18.932,    job39  18.984,    job40  62.350,    job41  62.646,    job42  44.798
job43  96.052,    job44  85.708,    job45   8.967,    job46  87.323,    job47  16.959,    job48  72.989
job49  16.317,    job50  10.763

(Some jobs have a start time of zero. They are not printed here.)

For scheduling models, this type of output is difficult to interpret by humans. Better is something like a Gantt chart:

Results for example data set


Indeed, we see that many jobs of the same category are executed in parallel. Categories 1 and 2 are each split into two separate periods; this is due to the due dates and precedence constraints. (Without due dates and precedence constraints, we would see just a single region for each category.) Also, note that some jobs have a little bit of freedom to float within a window. For instance, job 4 could have started a bit earlier. We can fix this in post-processing, or tell the model to move jobs to the left when possible (this would mean an additional term in the objective).

Note that commercial solvers have no problems with this model and data set.  For instance Cplex shows:

Root node processing (before b&c):
  Real time             =    3.16 sec. (583.75 ticks)
Parallel b&c, 8 threads:
  Real time             =   33.03 sec. (16153.92 ticks)
  Sync time (average)   =    4.97 sec.
  Wait time (average)   =    0.01 sec.
                          ------------
Total (root+branch&cut) =   36.19 sec. (16737.66 ticks)
MIP status(101): integer optimal solution
Cplex Time: 36.20sec (det. 16737.67 ticks)

However, I see more problems with open source and academic code. E.g., SCIP on NEOS shows:

SCIP Status        : problem is solved [optimal solution found]
Solving Time (sec) : 13010.25
Solving Nodes      : 5602808
Primal Bound       : +1.02753723219890e+02 (78 solutions)
Dual Bound         : +1.02753723219890e+02
Gap                : 0.00 %

This is quite a dramatic difference.

References


  1. Job scheduling with minimization by parallel grouping, https://stackoverflow.com/questions/59722383/job-scheduling-with-minimization-by-parallel-grouping
  2. Alan Manne, On the Job-Shop Scheduling Problem, Operations Research, 1960, vol. 8, no. 2, pp. 219-223. 

Saturday, January 11, 2020

MIP vs greedy search


Problem


Assume we have \(N\) points (with \(N\) even). Find \(N/2\) pairs of points, such that the sum of the lengths of the line segments based on these pairs is minimized [1]. This looks like an assignment problem where we do not know in advance the partition of the nodes into two equally sized sets. This problem is known as (minimum-weight) perfect matching on a graph; it can be solved with the Blossom algorithm [2].

A picture is probably better than my arduous description:

Optimal MIP solution: sum of the lengths is minimized

Let's try a MIP model.

MIP Formulation


The first thing to do is to calculate a distance matrix between points \(i\) and \(j\). As this matrix is symmetric, we only need to store the upper-triangular part (with \(i\lt j\)). That means we only need to store a little bit less than half the number of entries: \[\mathit{ndist}=\frac{1}{2}N(N-1)\] 

If we use as variables \[x_{i,j} = \begin{cases} 1 & \text{if nodes $i$ and $j$ are connected}\\ 0 & \text{otherwise}\end{cases}\] we need to think a bit about symmetry. If nodes 1 and 2 are connected, then we only want to see \(x_{1,2}=1\) while ignoring \(x_{2,1}=1\). So again, we only consider the variable \(x_{i,j}\) when \(i \lt j\). Again, this saves about half the number of variables we otherwise would use.

The model can look like: 

MIP Model
\[\begin{align}\min& \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The above picture is the optimal solution from this MIP model.

The constraint looks a bit strange. It basically says: every node \(i\) is either a start- or an end-point of a single segment. Somewhat surprising, at first sight, is that this is enough to characterize our graph. Not many MIP models have just one single constraint block.


Sidenote

We can see in a picture how the constraint for \(i=4\) is formed:


The blue zone must contain exactly one 1. If we pick \(j=7\), i.e. \(x_{4,7}=1\), the picture becomes:


It turns out this MIP model is fairly easy to solve. The example problem with \(N=50\) points solves in a matter of seconds. For \(N=500\) points, we get a large MIP with 124,750 binary variables and 500 constraints. Cplex solves this model to proven optimality in 90 seconds on my laptop.

Greedy heuristic


An obvious greedy heuristic is as follows:

  1. Find the pair with the shortest distance \[\min_{i \lt j} d_{i,j}\] and record this segment.
  2. Remove the two nodes that belong to this segment from the problem.
  3. If there are still nodes left, go to step 1.
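
A minimal Python sketch of this construction (random points stand in for the data set; sorting all pairs once is equivalent to repeatedly picking the shortest remaining pair):

import itertools, math, random

random.seed(1)
N = 50
pts = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(N)]
d = {(i, j): math.dist(pts[i], pts[j])
     for i, j in itertools.combinations(range(N), 2)}

def greedy_matching(d, N):
    """Repeatedly pick the shortest segment between unmatched points."""
    remaining, segments = set(range(N)), []
    for (i, j), _ in sorted(d.items(), key=lambda kv: kv[1]):
        if i in remaining and j in remaining:   # both endpoints still free
            segments.append((i, j))
            remaining -= {i, j}
    return segments

pairs = greedy_matching(d, N)
print("greedy sum:", sum(d[p] for p in pairs))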

As the sketch shows, this heuristic is exceedingly simple to implement. Some results for our 50 point example:

MIP vs Greedy

Points   Optimal MIP Objective   Greedy Objective
50       241.2667                339.236

The greedy algorithm is really astonishingly bad. The animation below shows why: the simple greedy algorithm does very well initially, but pays the price at the end.

Greedy algorithm in action


The picture on the left shows how my greedy algorithm selects the shortest possible segment at each step. The last few segments are really long. The right picture shows the lengths of the segments compared to the ones for the optimal MIP solution. The blue bars are formed by sorting the MIP solution. Until just before the end, things look hunky-dory for the greedy algorithm. But the last few segments are really bad.

We can distinguish three areas:

  • For the first, very small segments, the MIP solution and the greedy algorithm pick the same pairs. No difference.
  • In the middle, the greedy algorithm is doing a little bit better: it picks shorter segments than the MIP solver.
  • But at the end, the greedy algorithm totally collapses: it has to choose very long segments.

Improvement heuristic


We can follow the construction heuristic with an improvement heuristic. A simple one is to consider two segments and evaluate swapping their endpoints as follows:


If one of the alternatives has shorter segments, we perform the swap. We repeat this until no more improvement can be found.
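
A sketch of this swap loop, on top of the greedy solution above (the helper dist looks a pair up in either order; the small tolerance guards against endless swapping on ties):

def dist(i, j):
    return d[min(i, j), max(i, j)]

def improve(pairs):
    pairs = list(pairs)
    improved = True
    while improved:
        improved = False
        for p in range(len(pairs)):
            for q in range(p + 1, len(pairs)):
                a, b = pairs[p]
                c, e = pairs[q]
                # current pairing vs the two possible endpoint swaps
                best = min((dist(a, b) + dist(c, e), (a, b), (c, e)),
                           (dist(a, c) + dist(b, e), (a, c), (b, e)),
                           (dist(a, e) + dist(b, c), (a, e), (b, c)))
                if best[0] < dist(a, b) + dist(c, e) - 1e-9:
                    pairs[p], pairs[q] = best[1], best[2]
                    improved = True
    return pairs

better = improve(pairs)
print("improved sum:", sum(dist(*p) for p in better))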

Updating my plotting script leads to:

An improvement heuristic

MIP vs Greedy+Improvement

Algorithm             Objective
MIP Model (optimal)   241.27
Greedy                339.24
Improvement           269.66

This leads to much better results. The final sum of lengths is still a bit longer than optimal, but we actually have a shorter maximum length (again compared to optimal). The very long segments are gone. The difference compared to the optimal objective went from 40% (greedy algorithm) to 12% (improvement algorithm).

Notes:

  • The improvement algorithm depends on the initial configuration. It may help to repeat it with a different (e.g. random) initial solution.
  • This looks a bit like TSP nearest-neighbor and 2-opt heuristics.
  • The MIP model only looks at minimizing the sum. We could alternatively minimize the max, or use a combination of these two objectives. It is not unusual in practical models to combine min-sum with min-max.
  • Needless to say the animations are not real time: lots of delays in order to be able to see what is happening. 

This section was added later to this post.

References


  1. Finding minimum total length of line segments to connect 2N points, https://stackoverflow.com/questions/59684772/finding-minimum-total-length-of-line-segments-to-connect-2n-points
  2. Blossom Algorithm, https://en.wikipedia.org/wiki/Blossom_algorithm. Algorithm for matching problems.

Thursday, January 9, 2020

Preferences in assignments

Problem


The problem is as follows [1]:

  • We have \(N\) persons.
  • They have to be assigned to \(M\) groups.
  • Each group has a certain, given, size.
  • Each person has certain preferences to be in the same group with some other persons. For this, each person provides a list of persons (s)he would like or dislike to be in the same group with.
  • Design an assignment plan that takes these preferences into account.

If we don't allow empty spots in groups, we have the following condition on the data: \[\sum_g \mathit{size}_g = N\]



Data


The data is provided as a fancy spreadsheet:



We see we have \(N=28\) persons.

The matrix with preferences has three different colored symbols:

  • Yellow exclamation mark for 0
  • Green tick mark for +1
  • and a red x for -1

The matrix is not symmetric: e.g., there are some columns with all zero entries, while such rows do not exist. We need to read the matrix row-wise: each row represents the preferences of a person. Note that the diagonal has all zeros, which is what I would expect.

For the problem, I want to have the total group capacity equal to \(N=28\), so I consider:

  • 4 groups of 6
  • 1 group of 4  


High-level Model


A first attempt to model this problem is as follows. We use: \[x_{p,g} = \begin{cases} 1 & \text{if person $p$ is assigned to group $g$} \\  0 & \text{otherwise}\end{cases} \] as our decision variable. The objective is to maximize the total number of preferences \(+1\) or \(-1\) that are honored. With this we can write:

Quadratic Model
\[\begin{align}\max & \sum_{p_1,p_2,g} \color{darkblue}{\mathit{Pref}}_{p_1,p_2} \color{darkred} x_{p_1,g} \color{darkred} x_{p_2,g} \\ & \sum_g \color{darkred}x_{p,g}=1 & \forall p \\ & \sum_p \color{darkred}x_{p,g} = \color{darkblue}{\mathit{Size}}_{g} &\forall g \\ & \color{darkred} x_{p,g} \in \{0,1\}\end{align}\]

We can feed this into an MIQP solver. The solution looks like:


----     80 PARAMETER solution  using MIQP model

               group1      group2      group3      group4      group5

aimee               1
amber-la                                                1
amber-le                                                            1
andrina             1
catelyn-t                                   1
charlie                                                 1
charlotte                                   1
cory                            1
daniel                          1
ellie               1
ellis               1
eve                                         1
grace-c                                                 1
grace-g                                                 1
holly                                                   1
jack                            1
jade                                                                1
james                           1
kadie                                       1
kieran                                                              1
kristiana                                   1
lily                                                                1
luke                            1
naz                 1
nibah                                       1
niko                            1
wiki                1
zeina                                                   1
COUNT               6           6           6           6           4

Linearization


We can linearize this model in different ways:
  • Let the solver reformulate. Cplex, for instance, will automatically reformulate this problem and will actually solve this as a linear MIP.
  • DIY. We can linearize the products: \[y_{p_1,p_2,g} = x_{p_1,g}\cdot x_{p_2,g}\] using \[\begin{align} & y_{p_1,p_2,g} \le x_{p_1,g} \\& y_{p_1,p_2,g} \le x_{p_2,g} \\ & y_{p_1,p_2,g} \ge x_{p_1,g}+x_{p_2,g}-1 \end{align} \]  We can save a bit by only using \(y_{p_1,p_2,g}\) for \(p_1 \lt p_2\). This requires a bit of careful modeling. Let's give this a try.

Linear Model I
\[\begin{align}\max & \sum_{p_1\lt p_2,g} (\color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1}) \color{darkred} y_{p_1,p_2,g}  \\ & \sum_g \color{darkred}x_{p,g}=1 && \forall p \\ & \sum_p \color{darkred}x_{p,g} = \color{darkblue}{\mathit{Size}}_{g} && \forall g \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_1,g} && \forall p_1\lt p_2, g \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_2,g} && \forall p_1\lt p_2, g \\ &\color{darkred}y_{p_1,p_2,g} \ge \color{darkred}x_{p_1,g}+\color{darkred}x_{p_2,g} -1&&   \forall p_1\lt p_2, g\\ & \color{darkred} x_{p,g} \in \{0,1\} \\&\color{darkred}y_{p_1,p_2,g} \in [0,1] && \forall p_1\lt p_2, g \end{align}\]
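
A PuLP sketch of this linear model on a tiny made-up instance (4 persons, 2 groups of 2; the sparse pref dictionary is illustrative):

import itertools
import pulp as lp

persons = ["p1", "p2", "p3", "p4"]
groups = {"g1": 2, "g2": 2}                       # group -> size
pref = {("p1", "p2"): 1, ("p2", "p3"): -1, ("p4", "p1"): 1}

def c(a, b):  # symmetrized coefficient Pref[a,b] + Pref[b,a]
    return pref.get((a, b), 0) + pref.get((b, a), 0)

pairs = list(itertools.combinations(persons, 2))  # p1 < p2 only

m = lp.LpProblem("grouping", lp.LpMaximize)
x = lp.LpVariable.dicts("x", (persons, list(groups)), cat="Binary")
y = {(a, b, g): lp.LpVariable(f"y_{a}_{b}_{g}", 0, 1)
     for a, b in pairs for g in groups}

m += lp.lpSum(c(a, b) * y[a, b, g] for a, b in pairs for g in groups)
for p in persons:
    m += lp.lpSum(x[p][g] for g in groups) == 1   # everyone in one group
for g, size in groups.items():
    m += lp.lpSum(x[p][g] for p in persons) == size
for a, b in pairs:
    for g in groups:
        m += y[a, b, g] <= x[a][g]
        m += y[a, b, g] <= x[b][g]
        m += y[a, b, g] >= x[a][g] + x[b][g] - 1
m.solve()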


There is one extra optimization we can do. If the objective coefficient is positive, we can drop the constraint \(y_{p_1,p_2,g}  \ge x_{p_1,g}+x_{p_2,g}-1\). Similarly, if the objective coefficient is negative, we can drop the constraints \(y_{p_1,p_2,g} \le x_{p_1,g}\) and \(y_{p_1,p_2,g} \le x_{p_2,g}\). This leads to:

Linear Model II
\[\begin{align}\max & \sum_{p_1\lt p_2,g} (\color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1}) \color{darkred} y_{p_1,p_2,g}  \\ & \sum_g \color{darkred}x_{p,g}=1 && \forall p \\ & \sum_p \color{darkred}x_{p,g} = \color{darkblue}{\mathit{Size}}_{g} && \forall g \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_1,g} && \forall p_1, p_2, g | p_1\lt p_2, \color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1} \gt 0 \\ & \color{darkred}y_{p_1,p_2,g} \le \color{darkred}x_{p_2,g} && \forall p_1, p_2, g | p_1\lt p_2, \color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1} \gt 0\\ &\color{darkred}y_{p_1,p_2,g} \ge \color{darkred}x_{p_1,g}+\color{darkred}x_{p_2,g} -1&&   \forall p_1,p_2,g | p_1\lt p_2, \color{darkblue}{\mathit{Pref}}_{p_1,p_2}+\color{darkblue}{\mathit{Pref}}_{p_2,p_1} \lt 0\\ & \color{darkred} x_{p,g} \in \{0,1\} \\&\color{darkred}y_{p_1,p_2,g} \in [0,1] && \forall p_1\lt p_2, g \end{align}\]


Some computational results:

model           MIQP    MIP I    MIP II
objective       84      84       84
iterations      41841   67026    40262
nodes           822     3085     94
time (seconds)  6       15       6

Looks like MIP II is closer to what Cplex is doing.

Equality


In this model we optimize the sum of the achieved preferences for all persons. This may actually hurt some individuals.  Let's look at the number of +1 and -1 preferences in the data and how many were honored in the solution:


----     96 PARAMETER count  preferences stated and achieved

             +1 prefs    -1 prefs  +/-1 prefs       +1 ok       -1 ok     +/-1 ok

aimee               9           3          12           4           3           7
amber-la            3           2           5           3           2           5
amber-le            2           6           8           1           6           7
andrina             5                       5           4                       4
catelyn-t           5                       5           4                       4
charlie             6           4          10                       4           4
charlotte           1                       1           1                       1
cory                4           4           8           4           4           8
daniel              4           3           7           4           3           7
ellie               6                       6           4                       4
ellis              11           1          12           3           1           4
eve                 7                       7           4                       4
grace-c             2                       2           2                       2
grace-g             4           2           6           4           2           6
holly               3           2           5           3           2           5
jack                3           8          11           3           7          10
jade                1           6           7           1           6           7
james               6           4          10           5           4           9
kadie               6           1           7           4           1           5
kieran                          3           3                       2           2
kristiana           4           2           6           4           2           6
lily                2           7           9                       7           7
luke                5           3           8           5           3           8
naz                 6           2           8           4           2           6
nibah               6                       6           4                       4
niko                4           4           8           4           4           8
wiki                9                       9           4                       4
zeina               3           2           5           3           2           5
sum               127          69         196          86          67         153

The left three columns summarize the input data. The last three columns show what has materialized in the solution. There is some inequality here: e.g., Charlie gets only 4 of his 10 wishes honored, but Jack gets 10 of his 11 preferences obeyed. It would be interesting to see if we can find a more equitable solution that still has good overall performance.

Looking at the above table we can see that 86 of 127 +1 preferences were taken care of (68%) while 67 of 69 -1 preferences were obeyed (97%). There are more seats not at the table than at the table, so that asymmetry makes sense.

Notes:

  • the objective value (84) can be recovered from: \[86 - (69-67) = 84\]
  • to increase our objective, we can increase the size of the groups (and having fewer groups). For instance if we use groups of sizes 8,8,6,6, I achieve an optimal objective of 99.


A possible quality measure for each person is: \[\frac{\text{Number of preferences honored}}{\text{Number of preferences specified}}\] If we look at this, we see that ellis and charlie are doing poorly with 33% and 40%. To help prevent these bad cases, we can require that at least 50% (say) of preferences are met. For this data that would decrease the objective from 84 to 83. Actually this 50% is the best we can do.

References


  1. Algorithms for optimal student seating arrangements, https://stackoverflow.com/questions/59599718/algorithms-for-optimal-student-seating-arrangements

Saturday, January 4, 2020

Small Blending Problem in PuLP




In [1] a small blending problem is proposed.

There are different raw steel materials that have to be mixed (blended) into a final material that has certain specifications. In this case the specifications are limits on the elements Carbon (C), Copper (Cu) and Manganese (Mn). We assume things blend linearly.

Blending is a traditional linear programming application, and models are found in many text books.

The problem is small so let's try PuLP here.

I'll try to write a bit about indexing using strings (instead of integers), and compare PuLP with CVXPY and GAMS. As the model is small, I'll also add some data manipulation (using data frames) and some simple reporting (also using data frames). I use string indexing in PuLP (with GAMS this is standard). Of course, CVXPY is very different: it is matrix based, so there things are position dependent. The idea is to extract data from the data frame in such a way that positions are predictable.

Problem data


The data for the problem is as follows:


Demand: 5000 Kg

Specification of final material:

    Element      %Minimum %Max   
    Carbon       2         3     
    Copper       0.4       0.6   
    Manganese    1.2       1.65  


Raw material inventory:

Alloy          C%   Cu%   Mn%     Stocks kg Price € / kg
Iron alloy     2.50 0.00  1.30    4000      1.20
Iron alloy     3.00 0.00  0.80    3000      1.50
Iron alloy     0.00 0.30  0.00    6000      0.90
Copper alloy   0.00 90.00 0.00    5000      1.30
Copper alloy   0.00 96.00 4.00    2000      1.45
Aluminum alloy 0.00 0.40  1.20    3000      1.20
Aluminum alloy 0.00 0.60  0.00    2500      1.00


Mathematical Model


The basic model is:

Blending Model
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i\cdot \color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Min}}_j \le \frac{\displaystyle\sum_i \color{darkblue}{\mathit{Element}}_{i,j}\cdot \color{darkred}{\mathit{Use}}_i}{\displaystyle \sum_i \color{darkred}{\mathit{Use}}_i} \le \color{darkblue}{\mathit{Max}}_j \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & 0 \le \color{darkred}{\mathit{Use}}_i \le \color{darkblue}{\mathit{Available}}_i \end{align} \]

Here we use: \[\begin{cases} i  & \text{Types of raw material in stock}\\ j & \text{Element with limits}\\ {\mathit{Cost}}_i & \text{Unit cost of raw material} \\ {\mathit{Min}}_j, {\mathit{Max}}_j & \text{Limits on element content in final product} \\   {\mathit{Element}}_{i,j} & \text{Content of elements in raw material}\\ {\mathit{Demand}} & \text{Demand for final product}\\ {\mathit{Available}}_i  & \text{Availability of raw material} \\  {\mathit{Use}}_i  & \text{Decision variable: how much raw material to use} \end{cases}\] The blending constraint is nonlinear: we divide by the total weight of the final product to calculate the percentages. We can linearize this fraction in two ways:

  1. multiply all sides by \(\sum_i \mathit{Use}_i\). This leads to \[\mathit{Min}_j \cdot\sum_i \mathit{Use}_i \le \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j \sum_i \mathit{Use}_i\] This first reformulation is especially useful when the total final product is not constant. Note that this formulation is sometimes difficult to recognize when rewritten as something like: \[\begin{cases} \displaystyle \sum_i (\mathit{Element}_{i,j}-\mathit{Min}_j )\mathit{Use}_i  \ge 0 \>\>\forall j \\ \displaystyle \sum_i (\mathit{Max}_j - \mathit{Element}_{i,j})\mathit{Use}_i  \ge 0 \>\>\forall j \end{cases}\] 
  2. In our case we know that \(\sum_i \mathit{Use}_i\) is constant: it is always equal to \(\mathit{Demand}\). I.e. \[ \mathit{Min}_j  \le  \frac{1}{\mathit{Demand}} \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j\]


We often need to split sandwich equations into two simple inequalities. That often leads to duplicate expressions: \[\begin{cases}  \displaystyle  \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \ge \mathit{Min}_j \>\>\forall j \\  \displaystyle \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j} \cdot\mathit{Use}_i \le \mathit{Max}_j  \>\>\forall j \end{cases}\] For small problems this is not an issue. For larger problems, I prefer to introduce extra variables to prevent these duplicate expressions. We end up with the following linear programming formulation:

Linear Programming Formulation
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot\color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Demand}} \cdot \color{darkred}{\mathit{Content}}_j = \sum_i \color{darkblue}{\mathit{Element}}_{i,j} \cdot\color{darkred}{\mathit{Use}}_i \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & \color{darkred}{\mathit{Use}}_i \in [0, \color{darkblue}{\mathit{Available}}_i]\\ & \color{darkred}{\mathit{Content}}_j \in [\color{darkblue}{\mathit{Min}}_j,\color{darkblue}{\mathit{Max}}_j] \end{align} \]


Even for small, almost trivial models, it makes sense to first develop a mathematical model, especially if you are not very experienced in developing linear programming models. Starting with a pen and a piece of paper is sometimes better than immediately starting to code.

Implementation in Python/Pulp


An implementation using PuLP can look like:


from io import StringIO
import pandas as pd
import pulp as lp

# for inputting tabular data below
def table(s):
  return pd.read_csv(StringIO(s),sep=r'\s+',index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
   ID  Element      Min   Max
   C   Carbon       2     3
   Cu  Copper       0.4   0.6
   Mn  Manganese    1.2   1.65
    """)

supplyData = table("""
  ID  Alloy             C       Cu     Mn     Stock   Price
  A   "Iron alloy"      2.50    0.00   1.30   4000    1.20
  B   "Iron alloy"      3.00    0.00   0.80   3000    1.50
  C   "Iron alloy"      0.00    0.30   0.00   6000    0.90
  D   "Copper alloy"    0.00   90.00   0.00   5000    1.30
  E   "Copper alloy"    0.00   96.00   4.00   2000    1.45
  F   "Aluminum alloy"  0.00    0.40   1.20   3000    1.20
  G   "Aluminum alloy"  0.00    0.60   0.00   2500    1.00
  """)

print("----- Data-------")
print(requirements)
print(supplyData)


#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...] 
Items = supplyData.index
Elements = requirements.index

print("----- Indices-------")
print(Items)
print(Elements)

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------


use = lp.LpVariable.dicts("Use",Items,0,None,cat='Continuous')
content = lp.LpVariable.dicts("Content",Elements,0,None,cat='Continuous')

model = lp.LpProblem("Steel", lp.LpMinimize)

# objective : minimize cost
model += lp.lpSum([use[i]*supplyData.loc[i,'Price'] for i in Items ])

# upper bounds wrt availability
for i in Items:
  model += use[i] <= supplyData.loc[i,'Stock']

# final content of elements and their bounds  
for j in Elements:
  model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
  model += content[j] >= requirements.loc[j,'Min']
  model += content[j] <= requirements.loc[j,'Max']

# meet demand
model += lp.lpSum([use[i] for i in Items]) == demand


# for debugging
#print(model)


#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve()


print("----- Model Results-------")
print("Status:", lp.LpStatus[model.status])
print("Objective:",lp.value(model.objective))


# collect results
L = []
for i in Items: 
  L.append(['use',i,0.0,use[i].varValue,supplyData.loc[i,'Stock']])
for j in Elements:
  L.append(['content',j,requirements.loc[j,'Min'],content[j].varValue,requirements.loc[j,'Max']])
results = pd.DataFrame(L,columns=['Variable','Index','Lower','Value','Upper'])
print(results)


Notes:

  • We input the basic data as data frames. Data frames are a standard way to handle tabular data. Data frames are originally from the R statistical software system.
  • Usually read_csv is for CSV files. Here we use it to read from a string. Blanks are used as the separator to keep the table readable for humans.
  • For each data frame we added an index column. This index allows us to select a row from the data frame. Note that the index is a string. In general, using strings as an index is safer than using a positional number: we notice much earlier that things are wrong when making a mistake like using \(j\) (element) instead of \(i\) (raw material).
  • Python Pandas allows duplicate indices. We can check for this using the duplicated() function (see the fragment after these notes).
  • Because we access the data by name, it would not matter if the rows or columns are in a different position. This is more like a database table, where we assume no particular ordering. 
  • We also use a data frame for reporting. Data frames are printed in a nicer way than Python arrays, and they can be exported to CSV files or a spreadsheet with one function call.
  • The variables are also indexed by names. This is accomplished by lp.LpVariable.dicts(). This is safer than using a standard array of variables.
  • AFAIK, PuLP can only handle a single scalar bound in the LpVariable statement (e.g. all lower bounds for the variables are zero). This means we have to specify the remaining bounds as explicit singleton constraints or use the var.bounds method (see the fragment after these notes).
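Two of these notes (the duplicated() check and the var.bounds alternative) can be illustrated with a small fragment. This is a sketch that reuses the data frames and variables defined above; the assertions and bound assignments are my own additions:

# check for duplicate IDs in the input tables
assert not supplyData.index.duplicated().any(), "duplicate IDs in supplyData"
assert not requirements.index.duplicated().any(), "duplicate IDs in requirements"

# set individual bounds directly instead of adding singleton constraints
for i in Items:
  use[i].bounds(0, supplyData.loc[i,'Stock'])
for j in Elements:
  content[j].bounds(requirements.loc[j,'Min'], requirements.loc[j,'Max'])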

The results look like:


----- Data-------
      Element  Min   Max
ID                      
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID                                              
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00
----- Indices-------
Index(['A', 'B', 'C', 'D', 'E', 'F', 'G'], dtype='object', name='ID')
Index(['C', 'Cu', 'Mn'], dtype='object', name='ID')
----- Model Results-------
Status: Optimal
Objective: 5887.57427835
  Variable Index  Lower        Value    Upper
0      use     A    0.0  4000.000000  4000.00
1      use     B    0.0     0.000000  3000.00
2      use     C    0.0   397.763020  6000.00
3      use     D    0.0     0.000000  5000.00
4      use     E    0.0    27.612723  2000.00
5      use     F    0.0   574.624260  3000.00
6      use     G    0.0     0.000000  2500.00
7  content     C    2.0     2.000000     3.00
8  content    Cu    0.4     0.600000     0.60
9  content    Mn    1.2     1.200000     1.65


Safety


We will get an error if we misspell things. E.g., if we use Mgn instead of Mn in the second table, we will see:


KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
9 frames
<ipython-input-3-d87ed8f34980> in <module>()
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61   model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62   model += content[j] >= requirements.loc[j,'Min']
     63   model += content[j] <= requirements.loc[j,'Max']

<ipython-input-3-d87ed8f34980> in <listcomp>(.0)
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61   model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62   model += content[j] >= requirements.loc[j,'Min']
     63   model += content[j] <= requirements.loc[j,'Max']

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1416                 except (KeyError, IndexError, AttributeError):
   1417                     pass
-> 1418             return self._getitem_tuple(key)
   1419         else:
   1420             # we by definition only have the 0th axis

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
    803     def _getitem_tuple(self, tup):
    804         try:
--> 805             return self._getitem_lowerdim(tup)
    806         except IndexingError:
    807             pass

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    959                     return section
    960                 # This is an elided recursive call to iloc/loc/etc'
--> 961                 return getattr(section, self.name)[new_key]
    962 
    963         raise IndexingError("not applicable")

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1422 
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425 
   1426     def _is_scalar_access(self, key: Tuple):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1848         # fall thru to straight lookup
   1849         self._validate_key(key, axis)
-> 1850         return self._get_label(key, axis=axis)
   1851 
   1852 

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _get_label(self, label, axis)
    154             # but will fail when the index is not present
    155             # see GH5667
--> 156             return self.obj._xs(label, axis=axis)
    157         elif isinstance(label, tuple) and isinstance(label[axis], slice):
    158             raise IndexingError("no slices here, handle elsewhere")

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3735             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3736         else:
-> 3737             loc = self.index.get_loc(key)
   3738 
   3739             if isinstance(loc, np.ndarray):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'


PuLP is not giving a very well-formed error message here (I am not sure PuLP could even do that: Python raises the exception before PuLP sees what is happening). But at least we are alerted, rather heavy-handedly, that something is wrong when we use Mn. Careful inspection of the stack frames shows the problem is in the constraint

 model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])

Note that this is actually an exception raised inside an exception handler!

Of course, a much better and simpler error message would be: "supplyData.loc["A","Mn"]: Column "Mn" not found in data frame supplyData." IMHO, programmers do not pay enough attention to providing meaningful error messages.
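One way to get closer to such a message is to validate the data before building the constraints. A minimal sketch (the check and the message text are my own; this is not something pandas or PuLP provides out of the box):

# fail early, with a readable message, if an element column is missing
missing = [j for j in Elements if j not in supplyData.columns]
if missing:
  raise KeyError(f"supplyData: missing element column(s): {missing}")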

Solver log


The default solver is CBC, called via a DLL. I don't think it is possible to see the solver log with this setup. I prefer to see the solver log, just to make sure there are no surprises. For this, I used:

model.solve(lp.COIN_CMD(msg=1))

This will call the CBC executable (via an MPS file) and it will show the CBC log:


Welcome to the CBC MILP Solver
Version: 2.9
Build Date: Jan  6 2019

command line - cbc.exe fbbd73baa2494109b8e990ce26eb79b6-pulp.mps branch printingOptions all solution fbbd73baa2494109b8e990ce26eb79b6-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 22 COLUMNS
At line 64 RHS
At line 82 BOUNDS
At line 83 ENDATA
Problem MODEL has 17 rows, 10 columns and 34 elements
Coin0008I MODEL read with 0 errors
Presolve 4 (-13) rows, 7 (-3) columns and 18 (-16) elements
0  Obj 479.87991 Primal inf 10199.217 (4)
3  Obj 5887.5743
Optimal - objective value 5887.5743
After Postsolve, objective 5887.5743, infeasibilities - dual 1275.5592 (2), primal 0 (0)
Presolved model was optimal, full model needs cleaning up
Optimal - objective value 5887.5743
Optimal objective 5887.574275 - 3 iterations time 0.012, Presolve 0.00
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.03   (Wallclock seconds):       0.03


The solver log shows that the presolver removes 13 of the 17 rows. This high reduction rate is related to the singleton constraints: the model contains 3+3+7=13 of these bound constraints. The presolver gets rid of them and converts them into proper bounds.

Note that msg=1 may not always work when running inside a Jupyter notebook.
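If the log still does not show up, an alternative is to keep the files that are exchanged with the CBC executable and inspect them directly. A sketch, assuming the keepFiles option of PuLP's command-line solver interface:

# keep the generated MPS and solution files on disk for inspection
model.solve(lp.COIN_CMD(msg=1, keepFiles=True))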

Debugging 


For debugging PuLP models, I recommend:

  1. print(model). Printing the model shows how PuLP interpreted the constraints.
  2. Writing an LP file: model.writeLP("steel.lp").

The output of print(model) is:


Steel:
MINIMIZE
1.2*Use_A + 1.5*Use_B + 0.9*Use_C + 1.3*Use_D + 1.45*Use_E + 1.2*Use_F + 1.0*Use_G + 0.0
SUBJECT TO
_C1: Use_A <= 4000

_C2: Use_B <= 3000

_C3: Use_C <= 6000

_C4: Use_D <= 5000

_C5: Use_E <= 2000

_C6: Use_F <= 3000

_C7: Use_G <= 2500

_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0

_C9: Content_C >= 2

_C10: Content_C <= 3

_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F
 - 0.6 Use_G = 0

_C12: Content_Cu >= 0.4

_C13: Content_Cu <= 0.6

_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0

_C15: Content_Mn >= 1.2

_C16: Content_Mn <= 1.65

_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000

VARIABLES
Content_C Continuous
Content_Cu Continuous
Content_Mn Continuous
Use_A Continuous
Use_B Continuous
Use_C Continuous
Use_D Continuous
Use_E Continuous
Use_F Continuous
Use_G Continuous


The LP file looks like:


\* Steel *\
Minimize
OBJ: 1.2 Use_A + 1.5 Use_B + 0.9 Use_C + 1.3 Use_D + 1.45 Use_E + 1.2 Use_F
 + Use_G
Subject To
_C1: Use_A <= 4000
_C10: Content_C <= 3
_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F
 - 0.6 Use_G = 0
_C12: Content_Cu >= 0.4
_C13: Content_Cu <= 0.6
_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0
_C15: Content_Mn >= 1.2
_C16: Content_Mn <= 1.65
_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000
_C2: Use_B <= 3000
_C3: Use_C <= 6000
_C4: Use_D <= 5000
_C5: Use_E <= 2000
_C6: Use_F <= 3000
_C7: Use_G <= 2500
_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0
_C9: Content_C >= 2
End

The information is basically the same, but the ordering of the rows is a bit different.

Comparison to CVXPY


We can try to model and solve the same problem using CVXPY. CVXPY is matrix oriented, so it is quite different from PuLP. Here is my attempt:


from io import StringIO
import pandas as pd
import numpy as np
import cvxpy as cp

# for inputting tabular data below
def table(s):
  return pd.read_csv(StringIO(s), sep=r'\s+', index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
   ID  Element      Min   Max
   C   Carbon       2     3
   Cu  Copper       0.4   0.6
   Mn  Manganese    1.2   1.65
    """)

supplyData = table("""
  ID  Alloy             C       Cu     Mn     Stock   Price
  A   "Iron alloy"      2.50    0.00   1.30   4000    1.20
  B   "Iron alloy"      3.00    0.00   0.80   3000    1.50
  C   "Iron alloy"      0.00    0.30   0.00   6000    0.90
  D   "Copper alloy"    0.00   90.00   0.00   5000    1.30
  E   "Copper alloy"    0.00   96.00   4.00   2000    1.45
  F   "Aluminum alloy"  0.00    0.40   1.20   3000    1.20
  G   "Aluminum alloy"  0.00    0.60   0.00   2500    1.00
  """)

print("----- Data-------")
print(requirements)
print(supplyData)

#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...] 
Items = supplyData.index
Elements = requirements.index

# extract arrays (make sure order is identical)
Min = requirements.loc[Elements,"Min"]
Max = requirements.loc[Elements,"Max"]
Cost = supplyData.loc[Items,"Price"]
Avail = supplyData.loc[Items,"Stock"]
Element = supplyData.loc[Items,Elements]

# counts
NumItems = np.shape(Items)[0]
NumElements  = np.shape(Elements)[0]

# reshape into proper Numpy column vectors to make cvxpy happy
Min = np.reshape(Min.to_numpy(),(NumElements,1))
Max = np.reshape(Max.to_numpy(),(NumElements,1))
Cost = np.reshape(Cost.to_numpy(),(NumItems,1))
Avail = np.reshape(Avail.to_numpy(),(NumItems,1))
Element = Element.to_numpy()

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------

use = cp.Variable((NumItems,1),"Use",nonneg=True)
content = cp.Variable((NumElements,1),"Content",nonneg=True)

model = cp.Problem(cp.Minimize(Cost.T @ use),
                   [cp.sum(use) == demand,
                    cp.multiply(demand,content) == Element.T @ use,  
                    content >= Min,
                    content <= Max,
                    use <= Avail                    
                   ])

#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve(solver=cp.ECOS,verbose=True)

print("----- Model Results-------")
print("status:",model.status)
print("objective:",model.value)
results = pd.DataFrame({'variable':'use', 
                        'index': Items, 
                        'lower':0, 
                        'level':use.value.flatten(),
                        'upper':Avail.flatten()
                        })
# use pd.concat (DataFrame.append is deprecated in recent pandas versions)
results = pd.concat([results,
                     pd.DataFrame({'variable':'content',
                                   'index': Elements,
                                   'lower':Min.flatten(),
                                   'level':content.value.flatten(),
                                   'upper':Max.flatten()
                                   })])
print(results)



Notes:

  • I did my best to make sure that the ordering of rows and columns in the data frames is not significant. 
  • We convert the information in the data frames to standard NumPy arrays for the benefit of CVXPY. (A column in a data frame is a pandas Series.)
  • If we don't shape the arrays properly, we may see error messages like: ValueError: Cannot broadcast dimensions  (3,) (3, 1) (reproduced in the toy fragment after these notes).
  • The model is compact, but we needed to put more effort into the data extraction. In optimization, it is not at all unusual that the data handling takes more effort than the model equations.
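The shape issue is easy to reproduce in isolation. A toy fragment (not part of the model) showing the failing and the repaired constraint:

import numpy as np
import cvxpy as cp

x = cp.Variable((3,1))
b = np.ones(3)               # shape (3,): a flat vector
# x >= b                     # ValueError: Cannot broadcast dimensions (3,) (3, 1)
con = x >= b.reshape((3,1))  # fine: both sides are now (3,1) column vectors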

The results look like:


----- Data-------
      Element  Min   Max
ID                      
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID                                              
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00

ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS

It     pcost       dcost      gap   pres   dres    k/t    mu     step   sigma     IR    |   BT
 0  +6.293e+03  -4.362e+04  +9e+04  1e-01  8e-02  1e+00  4e+03    ---    ---    1  1  - |  -  - 
 1  +5.462e+03  -6.110e+04  +7e+04  2e-01  5e-02  2e+03  3e+03  0.5361  8e-01   0  0  0 |  0  0
 2  +5.497e+03  +1.418e+03  +7e+03  1e-02  4e-03  5e+02  3e+02  0.9313  3e-02   0  0  0 |  0  0
 3  +4.981e+03  +3.654e+03  +2e+03  4e-03  1e-03  2e+02  1e+02  0.6947  5e-02   0  0  0 |  0  0
 4  +5.687e+03  +3.974e+03  +2e+03  7e-03  9e-04  3e+02  9e+01  0.4022  7e-01   0  0  0 |  0  0
 5  +5.653e+03  +5.326e+03  +5e+02  1e-03  2e-04  2e+01  2e+01  0.9127  1e-01   0  0  0 |  0  0
 6  +5.692e+03  +5.535e+03  +2e+02  5e-04  8e-05  1e+01  1e+01  0.5874  1e-01   0  0  0 |  0  0
 7  +5.791e+03  +5.642e+03  +2e+02  6e-04  5e-05  2e+01  7e+00  0.7361  5e-01   0  0  0 |  0  0
 8  +5.843e+03  +5.798e+03  +6e+01  2e-04  2e-05  5e+00  2e+00  0.9890  4e-01   0  0  0 |  0  0
 9  +5.886e+03  +5.883e+03  +4e+00  1e-05  1e-06  3e-01  2e-01  0.9454  1e-02   0  0  0 |  0  0
10  +5.888e+03  +5.888e+03  +5e-02  2e-07  3e-08  4e-03  2e-03  0.9890  2e-03   0  0  0 |  0  0
11  +5.888e+03  +5.888e+03  +6e-04  2e-09  5e-10  5e-05  2e-05  0.9890  1e-04   1  0  0 |  0  0
12  +5.888e+03  +5.888e+03  +6e-06  2e-11  6e-12  5e-07  3e-07  0.9890  1e-04   1  0  0 |  0  0

OPTIMAL (within feastol=1.9e-11, reltol=1.1e-09, abstol=6.3e-06).
Runtime: 0.000566 seconds.

----- Model Results-------
status: optimal
objective: 5887.574272281105
  variable index  lower         level    upper
0      use     A    0.0  4.000000e+03  4000.00
1      use     B    0.0  1.283254e-06  3000.00
2      use     C    0.0  3.977630e+02  6000.00
3      use     D    0.0  5.135476e-07  5000.00
4      use     E    0.0  2.761272e+01  2000.00
5      use     F    0.0  5.746243e+02  3000.00
6      use     G    0.0  5.163966e-06  2500.00
0  content     C    2.0  2.000000e+00     3.00
1  content    Cu    0.4  5.999999e-01     0.60
2  content    Mn    1.2  1.200000e+00     1.65


The results are not rounded. Unrounded solutions from an interior point algorithm often look a bit ugly like this. In essence, this is the same solution as we found with PuLP/CBC.
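If this bothers us, we can clean up the report ourselves. A cosmetic sketch (the 5-decimal rounding is my choice, not something the solver dictates):

# round so tiny interior-point residuals like 1.28e-06 display as zero
results['level'] = results['level'].round(5)
print(results)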

Comparison to GAMS


The GAMS model is closer to the PuLP model:


*---------------------------------------------------
* data
*---------------------------------------------------

set
  i 'items from inventory' /A*G/
  j 'elements' /C,Cu,Mn/
;

table requirements(j,*)
         Min   Max
   C     2     3
   Cu    0.4   0.6
   Mn    1.2   1.65
;

table supplyData(i,*)
       C      Cu     Mn     Stock   Price
  A    2.5            1.3    4000    1.20
  B    3              0.8    3000    1.50
  C            0.3           6000    0.90
  D           90             5000    1.30
  E           96      4      2000    1.45
  F            0.40   1.20   3000    1.20
  G            0.60          2500    1.00
;

scalar demand /5000/;

*---------------------------------------------------
* LP Model
*---------------------------------------------------

positive variables
    use(i)       'usage of raw material'
    content(j)   'characteristics of final product'
;


* bounds
use.up(i) = supplyData(i,'Stock');
content.lo(j) = requirements(j,'Min');
content.up(j) = requirements(j,'Max');

variable totalCost 'objective variable';

equations
    obj  'objective'
    calcContent(j) 'calculate contents of final product'
    meetDemand 'total demand must be met'
;

obj.. totalCost =e= sum(i, use(i)*supplyData(i,'Price'));

calcContent(j).. demand*content(j) =e= sum(i, use(i)*supplyData(i,j));

meetDemand.. sum(i, use(i)) =e= demand;

model blending /all/;

*---------------------------------------------------
* Solve and reporting
*---------------------------------------------------
solve blending minimizing totalCost using lp;

parameter results(*,*,*);
results('use',i,'lower') = 0;
results('use',i,'level') = use.l(i);
results('use',i,'upper') = supplyData(i,'Stock');
results('content',j,'lower') = requirements(j,'Min');
results('content',j,'level') = content.l(j);
results('content',j,'upper') = requirements(j,'Max');
display results;

Notes:

  • In GAMS, we start with the sets, followed by the data. With external data, we can extract the sets from the data.
  • The rows and columns in the tables are not order dependent: we can change the order without changing the meaning of the model. 
  • We can specify bounds directly: no need for singleton constraints.
  • The statement results('use',i,'lower') = 0; is not really needed, as the default value is zero. (To be more precise: all data is stored sparsely, where "does not exist" is the same as zero.) However, this statement forces the set element 'lower' to be ordered before 'level' and 'upper'. Another way to enforce this is to introduce a dummy set: set dummy /lower,level,upper/. We don't have to use the set; just the declaration establishes the element ordering.
  • I used * in the data tables. This conveniently disables domain checking, but it is also dangerous. In this case, if we change "Mn" to "Mng" in the supplyData table, we will not see an error. It would be safer to use full domain checking, which requires adding a few sets.
  • The safe version of the data part in GAMS would be:
    set
      i    'items from inventory' /A*G/
      d    'demand limits' /Min,Max/
      s    'supply data' /C,Cu,Mn,Stock,Price/
      j(s) 'elements' /C,Cu,Mn/
    ;
    
    table requirements(j,d)
             Min   Max
       C     2     3
       Cu    0.4   0.6
       Mn    1.2   1.65
    ;
    
    table supplyData(i,s)
           C      Cu     Mng    Stock   Price
      A    2.5            1.3    4000    1.20
      B    3              0.8    3000    1.50
      C            0.3           6000    0.90
      D           90             5000    1.30
      E           96      4      2000    1.45
      F            0.40   1.20   3000    1.20
      G            0.60          2500    1.00
    ;
    
    Here the typo "Mng" will immediately give a good error message:
      20  table supplyData(i,s)
      21         C      Cu     Mng    Stock   Price
    ****                         $170
    **** 170  Domain violation for element
    For production models it is best to use domain checking throughout.
  • IMHO, this GAMS model looks a bit more streamlined than the Python models.


The output looks like:


----     70 PARAMETER results  

                 lower       level       upper

use    .A                 4000.000    4000.000
use    .B                             3000.000
use    .C                  397.763    6000.000
use    .D                             5000.000
use    .E                   27.613    2000.000
use    .F                  574.624    3000.000
use    .G                             2500.000
content.C        2.000       2.000       3.000
content.Cu       0.400       0.600       0.600
content.Mn       1.200       1.200       1.650

