Wednesday, January 29, 2020

Multi-objective optimization: min sum vs min max

The issue

In many models, we observe that an objective that minimizes a sum delivers undesirable results. Here are some examples:

Assignment under preferences

In [1], we formulated a MIP model that tackles the problem of creating groups taking into account the preferences of people. When we just maximize the total sum of preferences that are honored by the assignment, we can see that some participants get a terrible assignment. I.e., taking one for the team. It requires some thought to develop a model that produces a more equitable solution. In a sense, we are looking at trading-off overall efficiency for fairness.

Job scheduling

Typically in job scheduling models, we minimize (1) the total make-span: do things as quickly as possible and (2) the sum of tardiness of jobs. Sometimes we use a weighting scheme for tardy jobs: jobs from valuable clients may get a higher weight to make it less likely they are chosen to be tardy. There are some terms in the objective we can add, for instance:

  • The number of jobs that are tardy. The sum may distribute the pain over many jobs. If we want to reduce the number of tardy jobs we can add a term that minimizes the number of tardy jobs.
  • The maximum tardiness of a job. Adding such a term will try to prevent the situation where a single job will take the brunt of the pain.

These are examples where just minimizing the sum can lead to unpalatable solutions. It may require some effort to design a more sophisticated objective that takes into account more aspects than just overall efficiency.

Example model

In [2] I discussed a MIP formulation for a matching problem:

Given \(N=2M\) points, find a matching that minimizes the sum of the cost (represented by lengths between the matched points). 

A. Single solution

In this section, I discuss how we can look at the problem in different ways:

  1. MIN SUM: Solve the problem minimizing the sum of the cost.
  2. MIN MAX: Solve the problem minimizing the maximum cost. 
  3. MIN MAX followed by MIN SUM: this variant exploits that MIN MAX is not unique. 
  4. MIN SUM followed by MIN MAX: due to uniqueness, this is not an interesting problem.
  5. A weighted sum of MIN SUM and MIN MAX.

A.1. MIN SUM Matching 

The Min Sum problem can be stated as a MIP problem:

MIP Model for Min Sum Matching
\[\begin{align}\min& \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The optimal MIP solution looks like:

Optimal MIN SUM solution

In the solution, we see a spike: the longest length is much larger than the rest.

A.2. MIN MAX Problem

Instead of minimizing the sum, we can also minimize the length of the largest segment. Formally we want \[\min \max_{i\lt j} d_{i,j} x_{i,j}\] This can be linearized in a standard manner.

MIP Model for Min-Max Matching
\[\begin{align}\min\>& \color{darkred}z\\ & \color{darkred}z \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall  i \lt j\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The results look like:

Optimal MIN MAX solution

We see we do much better than the maximum length of the MIN SUM problem.

One issue with the MIN MAX problem is that the solution is not unique. Any configuration with lengths less than the maximum length is optimal.

A.3. MIN MAX followed by MIN SUM 

Instead of just using the MIN MAX objective, we can look at the problem as a multi-objective problem:

Multi-objective Model
\[\begin{align}\min \>& \color{darkred}z_1 = \max_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ \min\>& \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

Of course we need to reformulate the \(\max\) function to make it linear. The multi-objective structure was implemented as:

  1. Solve problem with the MIN MAX objective\[\begin{align} \min\>&z_1\\ &z_1 \ge d_{i,j} x_{i,j} &&\forall i\lt j \\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 \\ &x_{i,j} \in \{0,1\} \end{align}\] Let \(z_1^*\) be the optimal objective.
  2. Solve problem with the MIN SUM, subject to MIN MAX objective is not deteriorating (i.e. the length of each segment is less than the maximum length found in the previous step). \[\begin{align}\min\>& z_2 = \sum_{i,j|i \lt j} d_{i,j} x_{i,j} \\ & d_{i,j} x_{i,j} \le z_1^* && \forall i\lt j\\ & \sum_{j|j \gt i} x_{i,j} + \sum_{j|j \lt i} x_{j,i} = 1 && \forall i\\ & x_{i,j} \in \{0,1\}\end{align}\] 

This is typically called the lexicographic method or preemptive method.

The results look like:

Optimal multi-objective solution

We see a much better distribution of the lengths with this approach. The underlying reason is that the MIN MAX solution is not unique. There are a ton of solutions with all lengths less than the MIN-MAX objective (at least 50k: I stopped after reaching this number). With the second objective, we choose the "best" of all these solutions.

Here is a summary of the results:

MIN SUM241.267241.26738.768
MIN MAX22.954403.26522.954
MULTI-OBJECTIVE (second solve)247.382247.38222.954

We can see that:

  • The largest segment is very long for the MIN SUM model
  • The sum is very large for the MIN MAX model.
  • The MULTI model has a sum only a bit larger than from the MIN SUM model, and the largest length is identical to the MIN MAX model. It prevents the disadvantages of the MIN SUM and MIN MAX models.

A.4. MIN SUM followed by MIN MAX

The problem formed by first solving the MIN SUM problem followed by the MIN MAX model is not of much value: we get the same solution as the MIN-SUM problem. The reason is that the MIN-SUM solution is unique. That means that running the second model does not yield a different solution.

A.5. A weighted sum of MIN SUM and MIN MAX

An alternative to the preemptive or lexicographic method is to use a composite objective with some weights:

Multi-objective Model via Weighted Sum
\[\begin{align}\min \>&  \color{darkblue}w_1 \color{darkred}z_1 +  \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge   \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i,j| i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i\\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

The convention is we use \(w_i\gt 0\) and \(\sum_i w_i=1\). If we use\[\begin{cases} w_1 = 0.01 \\ w_2 = 0.99\end{cases}\] we get the same solution as in section A.3 MIN ABS followed by MIN SUM.

B. Finding All Efficient Solutions

In section A we looked at characterizing the "optimal" solution by a single point. When we have multiple objectives (min-sum and min-max), it can be insightful to looks at all "optimal" points or a representative subset. An optimal solution in this context can be defined as being non-dominated by another solution. Such a solution is called efficient. The collection of all efficient solutions is called the efficient frontier. 

B.1. Weighted sum loop

A first method is to vary the weights and solve for each weight. We don't want weights of zero (or one), so I generated the following step values:

----    135 PARAMETER step  

step0  0.010,    step1  0.050,    step2  0.100,    step3  0.150,    step4  0.200,    step5  0.250,    step6  0.300
step7  0.350,    step8  0.400,    step9  0.450,    step10 0.500,    step11 0.550,    step12 0.600,    step13 0.650
step14 0.700,    step15 0.750,    step16 0.800,    step17 0.850,    step18 0.900,    step19 0.950,    step20 0.990

If we solve our weighted sum model for these values, we get the following report:

----    147 PARAMETER front  

                w1          w2         sum         max         obj

step0        0.010       0.990     247.382      22.954      25.199
step1        0.050       0.950     247.382      22.954      34.176
step2        0.100       0.900     247.382      22.954      45.397
step3        0.150       0.850     247.382      22.954      56.619
step4        0.200       0.800     247.382      22.954      67.840
step5        0.250       0.750     247.382      22.954      79.061
step6        0.300       0.700     247.382      22.954      90.283
step7        0.350       0.650     247.382      22.954     101.504
step8        0.400       0.600     247.382      22.954     112.726
step9        0.450       0.550     247.382      22.954     123.947
step10       0.500       0.500     247.382      22.954     135.168
step11       0.550       0.450     247.382      22.954     146.390
step12       0.600       0.400     247.382      22.954     157.611
step13       0.650       0.350     247.382      22.954     168.833
step14       0.700       0.300     247.382      22.954     180.054
step15       0.750       0.250     241.267      38.768     190.642
step16       0.800       0.200     241.267      38.768     200.767
step17       0.850       0.150     241.267      38.768     210.892
step18       0.900       0.100     241.267      38.768     221.017
step19       0.950       0.050     241.267      38.768     231.142
step20       0.990       0.010     241.267      38.768     239.242

This is a bit disappointing: we only observe two different solutions. And we saw these before.

We may miss some efficient (i.e., non-dominated) solutions. In general, the weighted sum method suffers from two problems:

  1. We may miss some efficient solutions. Although each point found with the weighted sum method is non-dominated, the other way around is not true: there may be efficient solutions we cannot see with this method.
  2. If this method finds several efficient points, these points may not be nicely distributed.
It is noted that in our simple loop we can miss efficient solutions for two reasons:

  • The inherent problem of the weighted sum model: it can miss solutions.
  • The way we set up the loop: we only looked at 20 different weights. When we increase this, we may see more non-dominated solutions.

There is another method we can use.

B.2. Find non-dominated solutions

We can actively look for non-dominated solutions as follows:

  1. Start with arbitrary weights \(w_1, w_2 \in (0,1)\) to find a first solution on the efficient frontier. E.g. \(w_1 = w_2 = 0.5.\)
  2. Add to the model constraints on the objective values \(z_1, z_2\) such that a new solution is better in one of the objectives than any other found efficient solution
  3. Also, add a cut to the model that forbids the current solution.
  4. Solve. If infeasible, STOP. We are done.
  5. Go to step 2.

The model solved at each step \(K\) looks like:

MIP Model in search for non-dominated solutions
\[\begin{align}\min \>& \color{darkblue}w_1 \color{darkred}z_1 + \color{darkblue}w_2 \color{darkred}z_2 \\ & \color{darkred}z_1 \ge \color{darkblue}d_{i,j} \color{darkred}x_{i,j} && \forall i \lt j \\ & \color{darkred}z_2 = \sum_{i,j|i \lt j} \color{darkblue}d_{i,j} \color{darkred}x_{i,j}\\ & \sum_{j|j \gt i} \color{darkred}x_{i,j} + \sum_{j|j \lt i} \color{darkred}x_{j,i} = 1 && \forall i \\ & \color{darkred}z_1 \le \color{darkblue}z_1^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_1 \color{darkred}\delta && k=1,\dots,K-1\\ & \color{darkred}z_2 \le \color{darkblue}z_2^{(k)} - \color{darkblue}\epsilon + \color{darkblue}M_2 (1-\color{darkred}\delta) && k=1,\dots,K-1 \\ & \sum_{i,j|i\lt j} (2 \color{darkblue}x^{(k)}_{i,j}-1) \color{darkred}x_{i,j} \le \sum_{i,j|i\lt j} \color{darkblue}x^{(k)}_{i,j}-1 && k=1,\dots,K-1 \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}\delta \in \{0,1\} \end{align} \]

Here \(z^{(k)}\) and \(x^{(k)}\) indicate solutions found in earlier iterations. \(\epsilon\) is a small constant to make sure we are actually improving an objective. Good values for \(M\) can be found using our earlier optimization models. E.g.: \[\begin{align} &M_1 = 38.768 - 22.945 + \epsilon \\
& M_2 = 247.382 - 241.267 + \epsilon\end{align}\] The constraints \[\begin{align} &z_1 \le  z_1^{(k)} - \epsilon + M_1 \delta\\ &z_2 \le z_2^{(k)} - \epsilon + M_2 (1-\delta)\end{align}\] say: one of the objectives should be (strictly) better compared to any other point we have found earlier. Finally, the constraint: \[\sum_{i,j|i\lt j} (2 x^{(k)}_{i,j}-1) x_{i,j} \le \sum_{i,j|i\lt j} x^{(k)}_{i,j}-1\] forbids any previously found point to be considered again.

When we run this algorithm, we find the following solutions:

----    228 PARAMETER front2  

               sum         max         obj

point1   247.38237    22.95441   135.16839
point2   243.76532    33.90486   138.83509
point3   241.26672    38.76785   140.01729

We see that point2 is a newly found efficient solution. In addition, we only need to solve 4 models (the last model, not shown in the results, is infeasible). This model is substantially more complicated than the previous loop with different weights. But we get something in return for this additional complexity: this algorithm produces all efficient points and depending on the number of non-dominated solutions, may need fewer models to solve.


In practical models, just what constitutes an optimal solution, is not always easy to define. An overall best solution may hurt some individuals disproportional. In our simple example, we saw a MIN SUM solution contained a few very bad matches. Trying to fix this with a multi-objective approach requires some thought. And this may give rise to new questions such as: give me all (or some subset) of the non-dominated solutions. This again is not an easy question.


No comments:

Post a Comment