## Thursday, June 25, 2020

### Select m items but optimize for the best k items

#### Problem Statement

This is a strange one [1]. Consider a standard objective function: $\max\>\sum_j p_j\cdot x_j$ where $$p$$ are coefficients and $$x$$ are binary decision variables. This will look at all $$n$$ objective coefficients. Now the question is:

Can I optimize just for the $$k$$ best ones?

The problem is that we don't know in advance what the best $$k$$ items will be. Just picking the largest three $$p_j$$ is not right: possibly not all of these items can be selected at the same time. So the best way is to let the model pick the $$k$$ best items. We can do this by introducing a set of binary variable $$\delta_j$$ that can have $$k$$ of them turned on. To be precise: \begin{align} \max & \sum_j \delta_j (p_j\cdot x_j)\\ & \sum_j \delta_j = k \\ & \delta_j \in \{0,1\}\end{align} We introduced extra binary decision variables $$\delta$$. But also: we made the objective non-linear.

The question is: how can we model this while keeping the model linear.

#### Example problem

Here is a small multi-dimensional knapsack-like problem:

Base MIP Model
\begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}

The randomly generated data looks like:

----     16 PARAMETER p  objective coefficients

j1  0.172,    j2  0.843,    j3  0.550,    j4  0.301,    j5  0.292,    j6  0.224,    j7  0.350,    j8  0.856
j9  0.067,    j10 0.500

----     16 PARAMETER a  matrix coefficients

j1          j2          j3          j4          j5          j6          j7          j8          j9

i1       0.998       0.579       0.991       0.762       0.131       0.640       0.160       0.250       0.669
i2       0.360       0.351       0.131       0.150       0.589       0.831       0.231       0.666       0.776
i3       0.110       0.502       0.160       0.872       0.265       0.286       0.594       0.723       0.628
i4       0.413       0.118       0.314       0.047       0.339       0.182       0.646       0.561       0.770
i5       0.661       0.756       0.627       0.284       0.086       0.103       0.641       0.545       0.032

+         j10

i1       0.435
i2       0.304
i3       0.464
i4       0.298
i5       0.792

----     16 PARAMETER b  rhs values

i1 2.500,    i2 2.500,    i3 2.500,    i4 2.500,    i5 2.500

----     16 PARAMETER m                    =        5.000  number of x to select


To see the "best" $$x_j$$, I reformulated the problem to:

Base MIP Model with y variables
\begin{align} \max& \sum_j \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \color{darkred}x_j \in \{0,1\}\end{align}

When we solve this problem we see:

----     41 VARIABLE x.L  binary decision variables

j3 1.000,    j5 1.000,    j6 1.000,    j7 1.000,    j8 1.000

----     41 VARIABLE y.L  auxiliary variables (y=p*x)

j3 0.550,    j5 0.292,    j6 0.224,    j7 0.350,    j8 0.856

----     41 VARIABLE z.L                   =        2.273  objective

----     61 PARAMETER yordered  ordered version of y

k1.j8 0.856
k2.j3 0.550  <=== best three
k3.j7 0.350
k4.j5 0.292
k5.j6 0.224


The model says to select items 3,5,6,7 and 8. The three "best" items are j8 (0.856),  j3 (0.550), and j7 (0.350).

Let's optimize for the best three. Note that these are not necessarily the same items as were the best here. We don't know in advance which are the three best! We change the model to:

MIQP Model
\begin{align} \max& \sum_j \color{darkred}\delta_j \cdot \color{darkred}y_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}

This will automatically indicate the best three items by $$\delta_j=1$$ and will optimize just these three. The objective is quadratic and non-convex. So, we can solve this with a non-convex solver (or a solver smart enough to reformulate things automatically). E.g. when we feed this into Cplex we see:

Classifier predicts products in MIQP should be linearized.[2]

The results look like:

----     75 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000

----     75 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000

----     75 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500

----     75 VARIABLE z.L                   =        1.907  objective

----     89 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550  <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067


This has changed the best three. The third item has changed from  j7 (with contribution 0.350) to j10 (0.5). The selected items become 3,5,8,9, and 10. This is different than before.

There are few linearizations we can consider:

• Create ordered variables for $$y$$ inside the model. Once we know these, just add the first three to the objective. Sorting variables inside a MIP is not completely trivial.
• Better is to linearize $$\delta_j \cdot y_j = p_j \cdot \delta_j \cdot x_j$$ directly.
• Even better is to note that $$\delta_j = 1 \Rightarrow x_j=1$$ or $$x_j\ge \delta_j$$. This is actually the only thing we need. So we end up with:

Linear MIP Model
\begin{align} \max& \sum_j \color{darkblue}p_j \cdot \color{darkred}\delta_j \\ & \color{darkred}x_j \ge \color{darkred}\delta_j \\ & \color{darkred}y_j = \color{darkblue}p_j \cdot \color{darkred}x_j \\ & \sum_j \color{darkblue}a_{i,j} \cdot \color{darkred}x_j \le \color{darkblue}b_i && \forall i \\ & \sum_j \color{darkred}x_j = \color{darkblue}m \\ & \sum_j \color{darkred}\delta_j = 3\\ & \color{darkred}x_j \in \{0,1\}\\& \color{darkred}\delta_j \in \{0,1\}\end{align}

The solution looks like:

----     52 VARIABLE x.L  binary decision variables

j3  1.000,    j5  1.000,    j8  1.000,    j9  1.000,    j10 1.000

----     52 VARIABLE delta.L  three best y variables

j3  1.000,    j8  1.000,    j10 1.000

----     52 VARIABLE y.L  auxiliary variables (y=p*x)

j3  0.550,    j5  0.292,    j8  0.856,    j9  0.067,    j10 0.500

----     52 VARIABLE z.L                   =        1.907  objective

----     72 PARAMETER yordered  ordered version of y

k1.j8  0.856
k2.j3  0.550   <=== best three
k3.j10 0.500
k4.j5  0.292
k5.j9  0.067


This is the same solution as obtained by the MIQP model.

#### Conclusion

It turns out that linearizing the problem of finding the "best three" is rather easy. A little bit surprising to me: I expected to have to do some complicated sorting. Obviously, we exploited here that the decision variables $$x_j$$ are binary variables.

#### References

1. With R and lpsolve is there a way of optimising for the best 11 elements but still pick 15 elements in total? https://stackoverflow.com/questions/62511975/with-r-and-lpsolve-is-there-a-way-of-optimising-for-the-best-11-elements-but-sti
2. Pierre Bonami, Andrea Lodi, Giulia Zarpellon, A Classifier to Decide on the Linearization of Mixed-Integer Quadratic Problems in CPLEX, http://www.optimization-online.org/DB_HTML/2020/03/7662.html