#### Problem Statement

This is a strange one [1]. Consider a standard objective function: \[\max\>\sum_j p_j\cdot x_j\] where the \(p_j\) are coefficients and the \(x_j\) are binary decision variables. This objective looks at all \(n\) coefficients. Now the question is:

Can I optimize just for the \(k\) best ones?

The problem is that we don't know in advance which \(k\) items will turn out to be the best. Just picking the \(k\) largest \(p_j\) is not correct: possibly not all of these items can be selected at the same time. So the best approach is to let the model pick the \(k\) best items. We can do this by introducing binary variables \(\delta_j\), of which exactly \(k\) can be turned on. To be precise: \[\begin{align} \max & \sum_j \delta_j (p_j\cdot x_j)\\ & \sum_j \delta_j = k \\ & \delta_j \in \{0,1\}\end{align}\] We introduced extra binary decision variables \(\delta_j\). But we also made the objective non-linear.

The question is: how can we model this while keeping the model linear?
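To make the notion concrete before building it into a model: for a *given* selection, the "best \(k\)" value can be computed outside the model by a simple sort. A minimal Python sketch (the function name is mine, purely for illustration):

```python
def top_k_value(p_selected, k):
    """Sum of the k largest contributions among the selected items."""
    return sum(sorted(p_selected, reverse=True)[:k])

# If the selected items have contributions 0.8, 0.3, 0.5 and 0.1,
# the best-three value is 0.8 + 0.5 + 0.3 = 1.6.
print(round(top_k_value([0.8, 0.3, 0.5, 0.1], 3), 6))  # 1.6
```

The difficulty is that inside the model the selection itself is a decision, so this sort cannot simply be precomputed.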

#### Example problem

Here is a small multi-dimensional knapsack-like problem:

Base MIP Model |
---|
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\] |

The randomly generated data looks like:

```
----     16 PARAMETER p  objective coefficients

j1  0.172,    j2  0.843,    j3  0.550,    j4  0.301,    j5  0.292,
j6  0.224,    j7  0.350,    j8  0.856,    j9  0.067,    j10 0.500

----     16 PARAMETER a  matrix coefficients

            j1          j2          j3          j4          j5          j6          j7          j8          j9

i1       0.998       0.579       0.991       0.762       0.131       0.640       0.160       0.250       0.669
i2       0.360       0.351       0.131       0.150       0.589       0.831       0.231       0.666       0.776
i3       0.110       0.502       0.160       0.872       0.265       0.286       0.594       0.723       0.628
i4       0.413       0.118       0.314       0.047       0.339       0.182       0.646       0.561       0.770
i5       0.661       0.756       0.627       0.284       0.086       0.103       0.641       0.545       0.032

 +         j10

i1       0.435
i2       0.304
i3       0.464
i4       0.298
i5       0.792

----     16 PARAMETER b  rhs values

i1 2.500,    i2 2.500,    i3 2.500,    i4 2.500,    i5 2.500

----     16 PARAMETER m  =  5.000  number of x to select
```
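For the small experiments further on it is handy to have this data in a general-purpose language; below is a direct Python transcription of the rounded values from the listing:

```python
# Rounded data from the GAMS listing: objective coefficients p,
# constraint matrix a, right-hand sides b, number of items to select m.
p = {f"j{j}": v for j, v in enumerate(
    [0.172, 0.843, 0.550, 0.301, 0.292, 0.224, 0.350, 0.856, 0.067, 0.500],
    start=1)}
a = {
    "i1": [0.998, 0.579, 0.991, 0.762, 0.131, 0.640, 0.160, 0.250, 0.669, 0.435],
    "i2": [0.360, 0.351, 0.131, 0.150, 0.589, 0.831, 0.231, 0.666, 0.776, 0.304],
    "i3": [0.110, 0.502, 0.160, 0.872, 0.265, 0.286, 0.594, 0.723, 0.628, 0.464],
    "i4": [0.413, 0.118, 0.314, 0.047, 0.339, 0.182, 0.646, 0.561, 0.770, 0.298],
    "i5": [0.661, 0.756, 0.627, 0.284, 0.086, 0.103, 0.641, 0.545, 0.032, 0.792],
}
b = {i: 2.5 for i in a}  # all right-hand sides are 2.5
m = 5                    # number of x variables to select
```

Note that these are display-rounded values; GAMS works with the unrounded data, so objective values recomputed from this transcription can differ in the last digit from the listings below.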

To see the "best" \(x_j\), I reformulated the problem to:

Base MIP Model with y variables |
---|
\[ \begin{align} \max& \sum_j \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j && \forall j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}\] |

When we solve this problem we see:

```
----     41 VARIABLE x.L  binary decision variables

j3 1.000,    j5 1.000,    j6 1.000,    j7 1.000,    j8 1.000

----     41 VARIABLE y.L  auxiliary variables (y=p*x)

j3 0.550,    j5 0.292,    j6 0.224,    j7 0.350,    j8 0.856

----     41 VARIABLE z.L  =  2.273  objective

----     61 PARAMETER yordered  ordered version of y

k1.j8 0.856
k2.j3 0.550   <=== best three
k3.j7 0.350
k4.j5 0.292
k5.j6 0.224
```

The model says to select items 3,5,6,7 and 8. The three "best" items are j8 (0.856), j3 (0.550), and j7 (0.350).
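As a sanity check outside GAMS, the base model is small enough to brute-force over all \(\binom{10}{5}=252\) selections. A self-contained Python sketch using the rounded data from the listing (with rounded data the objective evaluates to 2.272 rather than the 2.273 printed from the unrounded values):

```python
from itertools import combinations

# Rounded data from the listing (index order j1..j10, rows i1..i5).
p = [0.172, 0.843, 0.550, 0.301, 0.292, 0.224, 0.350, 0.856, 0.067, 0.500]
a = [[0.998, 0.579, 0.991, 0.762, 0.131, 0.640, 0.160, 0.250, 0.669, 0.435],
     [0.360, 0.351, 0.131, 0.150, 0.589, 0.831, 0.231, 0.666, 0.776, 0.304],
     [0.110, 0.502, 0.160, 0.872, 0.265, 0.286, 0.594, 0.723, 0.628, 0.464],
     [0.413, 0.118, 0.314, 0.047, 0.339, 0.182, 0.646, 0.561, 0.770, 0.298],
     [0.661, 0.756, 0.627, 0.284, 0.086, 0.103, 0.641, 0.545, 0.032, 0.792]]
b, m = 2.5, 5

def feasible(sel):
    """Check all knapsack constraints for a tuple of 1-based item indices."""
    return all(sum(row[j - 1] for j in sel) <= b for row in a)

# Enumerate all C(10,5) = 252 selections of m items and keep the best one.
best = max((s for s in combinations(range(1, 11), m) if feasible(s)),
           key=lambda s: sum(p[j - 1] for j in s))
print(best, round(sum(p[j - 1] for j in best), 3))  # (3, 5, 6, 7, 8) 2.272
```

This confirms the MIP solution: items 3, 5, 6, 7, 8.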

Let's now optimize for just the best three. Note that these are not necessarily the same items that came out best above: we don't know in advance which three are the best! We change the model to:

MIQP Model |
---|
\[ \begin{align} \max& \sum_j \color{darkred}\delta_j \cdot \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j && \forall j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\] |

This will automatically indicate the best three items by \(\delta_j=1\) and optimize over just these three. The objective is quadratic and non-convex, so we need a non-convex solver (or a solver smart enough to reformulate things automatically). E.g., when we feed this into Cplex we see:

Classifier predicts products in MIQP should be linearized.[2]

The results look like:

```
----     75 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000

----     75 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000

----     75 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500

----     75 VARIABLE z.L  =  1.907  objective

----     89 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550   <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067
```

This has changed the best three: the third item changed from j7 (contribution 0.350) to j10 (0.500). The selected items become 3, 5, 8, 9, and 10, which is different from before.
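The MIQP result can be cross-checked with the same brute-force approach: enumerate all feasible selections of five items, but now score each selection by the sum of its three largest \(p\)-values (again, with the rounded data the optimum evaluates to 1.906 instead of the printed 1.907):

```python
from itertools import combinations

# Rounded data from the listing (index order j1..j10, rows i1..i5).
p = [0.172, 0.843, 0.550, 0.301, 0.292, 0.224, 0.350, 0.856, 0.067, 0.500]
a = [[0.998, 0.579, 0.991, 0.762, 0.131, 0.640, 0.160, 0.250, 0.669, 0.435],
     [0.360, 0.351, 0.131, 0.150, 0.589, 0.831, 0.231, 0.666, 0.776, 0.304],
     [0.110, 0.502, 0.160, 0.872, 0.265, 0.286, 0.594, 0.723, 0.628, 0.464],
     [0.413, 0.118, 0.314, 0.047, 0.339, 0.182, 0.646, 0.561, 0.770, 0.298],
     [0.661, 0.756, 0.627, 0.284, 0.086, 0.103, 0.641, 0.545, 0.032, 0.792]]
b, m, k = 2.5, 5, 3

def feasible(sel):
    """Check all knapsack constraints for a tuple of 1-based item indices."""
    return all(sum(row[j - 1] for j in sel) <= b for row in a)

def best_k(sel):
    """Objective of the 'best k' model for a fixed selection: top-3 sum."""
    return sum(sorted((p[j - 1] for j in sel), reverse=True)[:k])

best = max((s for s in combinations(range(1, 11), m) if feasible(s)),
           key=best_k)
print(best, round(best_k(best), 3))  # (3, 5, 8, 9, 10) 1.906
```

This agrees with the MIQP solution: items 3, 5, 8, 9, 10, with best three j8, j3, j10.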

There are a few linearizations we can consider:

- Create ordered variables for \(y\) inside the model. Once we know these, just add the first three to the objective. Sorting variables inside a MIP is not completely trivial.
- Better is to linearize \(\delta_j \cdot y_j = p_j \cdot \delta_j \cdot x_j\) directly.
- Even better is to note that \(\delta_j = 1 \Rightarrow x_j=1\) or \(x_j\ge \delta_j\). This is actually the only thing we need. So we end up with:

Linear MIP Model |
---|
\[ \begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}\delta_j \\ & \color{darkred}x_j \ge \color{darkred}\delta_j && \forall j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j && \forall j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}\] |
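To see why this linearization works, fix \(x\) and look at the inner maximization over \(\delta\): with \(\delta_j \le x_j\) and \(\sum_j \delta_j = 3\), the best the model can do is point \(\delta\) at the three selected items with the largest \(p_j\). A small Python check of this equivalence, with \(x\) fixed to the solution found above (this is an illustration of the argument, not a solver run):

```python
from itertools import combinations

# p-vector from the example; x fixed to the solution found above
# (items j3, j5, j8, j9, j10 selected).
p = [0.172, 0.843, 0.550, 0.301, 0.292, 0.224, 0.350, 0.856, 0.067, 0.500]
x = [0, 0, 1, 0, 1, 0, 0, 1, 1, 1]
k = 3

# Inner maximization of the linear model for fixed x: choose delta with
# delta_j <= x_j and sum(delta_j) = k, maximizing sum(p_j * delta_j).
selected = [j for j in range(10) if x[j] == 1]
inner_max = max(sum(p[j] for j in d) for d in combinations(selected, k))

# This equals the sum of the k largest contributions among the selected
# items, i.e. exactly the "best k" objective.
top_k = sum(sorted((p[j] for j in selected), reverse=True)[:k])
print(round(inner_max, 3), round(top_k, 3))  # 1.906 1.906
```

So for every fixed selection \(x\), the linear model's objective reproduces the "best three" value, which is why dropping the product \(\delta_j \cdot y_j\) loses nothing.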

The solution looks like:

```
----     52 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000

----     52 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000

----     52 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500

----     52 VARIABLE z.L  =  1.907  objective

----     72 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550   <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067
```

This is the same solution as obtained by the MIQP model.

#### Conclusion

It turns out that linearizing the problem of finding the "best three" is rather easy. This was a little surprising to me: I expected to have to do some complicated sorting inside the model. Obviously, we exploited here that the decision variables \(x_j\) are binary.

#### References

1. With R and lpsolve is there a way of optimising for the best 11 elements but still pick 15 elements in total?, https://stackoverflow.com/questions/62511975/with-r-and-lpsolve-is-there-a-way-of-optimising-for-the-best-11-elements-but-sti
2. Pierre Bonami, Andrea Lodi, Giulia Zarpellon, A Classifier to Decide on the Linearization of Mixed-Integer Quadratic Problems in CPLEX, http://www.optimization-online.org/DB_HTML/2020/03/7662.html