Original Problem |
---|

\[\begin{align}\max & \sum_i \log_{100}(\color{darkblue}w_i) \cdot \color{darkred}x_i \\ & \frac{\sum_i \color{darkblue}w_i \color{darkred}x_i}{\sqrt{\color{darkred}k}} \le 10,000\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}x_i \in \{0,1\} \end{align}\] |

The data vector \(w_i\) is assumed to integer valued with \(w_i\ge 1\). Hence the logarithm can be evaluated without a problem. Also we can assume that the number of items can be around 1,000.

I don't think I have ever seen \(\log_{100}()\) being used in a model. Most programming (and modeling) languages only support natural logarithms \(\ln()\) and may be \(\log_{10}()\). We can convert things by: \[\log_{100}(x) = \frac{\ln(x)}{\ln(100)}\] This means the objective can be written as \[\max \frac{1}{\ln(100)} \sum_i \ln(w_i) x_i\] Essentially, the \(\log_{100}()\) function just adds a scaling factor. We can simplify the objective to \[\max \sum_i \ln(w_i)x_i\] (The objective value will be different, but the optimal solution will be the same). Note that this is a linear objective.

In this post I'll discuss a few different formulations (MINLP, MIQCP and MIP), and report my findings.

#### MINLP Formulation

The constraint can be rewritten as: \[\sum_i w_i x_i \le 10,000 \sqrt{k}\] If we ignore the all zero solution, we can assume \(k\ge 1\). This bound will make sure the square root function is always differentiable. With this, we have a standard MINLP (Mixed Integer Nonlinear Programming) model.

MINLP Model |
---|

\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \sum_i \color{darkblue}w_i \color{darkred}x_i \le 10,000\sqrt{\color{darkred}k}\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred} k \ge 1 \end{align}\] |

This model has only a single, well behaved, non-linearity. Literally, there is only one nonlinear nonzero element. In addition, the model is

**convex**. So we don't expect any problems. For \(w_i\), I generated 1000 random integer values from the interval \([1,10000]\).

MINLP Results | |||
---|---|---|---|

Solver | Obj | Time | Notes |

Dicopt | 1057.1355 | 0.6 | 3 NLP, 2 MIP subproblems |

SBB | 1056.9361 | 95 | Node limit exceeded |

Bonmin | 1056.9472 | 600 | Time limit exceeded |

Bonmin | 1057.1355 | 16 | Option: bonmin.algorithm B-OA |

The

**outer-approximation**based algorithms (Dicopt, Bonmin with B-OA option) do much better than the

**branch & bound**algorithms (SBB, default Bonmin). Even the global solvers do better:

Global Solver Results | |||
---|---|---|---|

Solver | Obj | Time | |

Baron | 1057.1355 | 2 | |

Antigone | 1057.1355 | 1 | |

Couenne | 1057.1355 | 9 |

#### MIQCP Formulation

The problem can also be formulated as a convex MIQCP (Mixed Integer Quadratically Constrained Programming) model:

MIQCP Model |
---|

\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \color{darkred}y^2 \le \color{darkred}k\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}y = \frac{\sum_i \color{darkblue}w_i \color{darkred}x_i}{10,000} \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred} k \ge 1, \color{darkred}y \ge 0 \end{align}\] |

Cplex has two ways to form relaxations:

- MIQCPstrat 1: SOCP-based branch & bound
- MIQCPstrat 2: OA-based branch & cut

The results are shown in the next section. We see again that the Outer Approximation approach works better.

#### MIP Formulation

Finally, we can also linearize this model by observing that \(k\in \{1,\dots,1000\}\). So we don't really have a continuous function \(f(k)=\sqrt{k}\), but rather only need function values at the integer points. We can exploit this by making this explicit. We can write: \[\begin{align} & k = \sum_i i\cdot \delta_i \\ & \sqrt{k} = \sum_i \sqrt{i}\cdot \delta_i \\ & \sum_i \delta_i = 1 \\ & \delta_i \in \{0,1\}\end{align}\] This is essentially a SOS1 (Special Ordered Set of Type 1) structure implementing a table lookup. The MIP model looks like:

MIP Model |
---|

\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \sum_i \color{darkblue}w_i \color{darkred} x_i \le 10,000 \color{darkred}q \\ & \color{darkred} k = \sum_i i \cdot \color{darkred}\delta_i \\ & \color{darkred} q = \sum_i \sqrt{i} \cdot \color{darkred}\delta_i \\ & \sum_i \color{darkred} \delta_i = 1 \\ & \color{darkred}x_i, \color{darkred}\delta_i \in \{0,1\} \\ & \color{darkred} k, \color{darkred} q \ge 1 \end{align}\] |

When we solve this problem we see:

MIQCP and MIP Results | ||||
---|---|---|---|---|

Model | Solver | Obj | Time | Notes |

MIQCP | Cplex | 1057.1355 | 600 | SOCP-based branch & bound, time limit |

MIQCP | Cplex | 1057.1355 | 0.3 | OA-based branch & cut |

MIP | Cplex | 1057.1355 | 0.6 |

Conclusion: the question in the original post was: how to solve this problem? Here we proposed three different models: an MINLP, MIQCP and MIP model. All these models can be solved quickly. It is noted that the quadratic and linear models are

**not**approximations: they give the same solution as the original MINLP model. Pure non-linear branch & bound methods are having a bit of a problem with the MINLP model, but Outer-Approximation works very well.

#### References

- How do we solve a variant of the knapsack problem in which the capacity of the knapsack keeps increasing as we add more items into the knapsack?, https://stackoverflow.com/questions/59242370/how-do-we-solve-a-variant-of-the-knapsack-problem-in-which-the-capacity-of-the-k
- Pierre Bonami, Andrea Tramontani, Recent improvement to MISOCP in CPLEX, Informs 2015, http://www.optimizationdirect.com/data/informs2015_bonami.pdf

#### Appendix: automatic reformulation with CVXPY

CVXPY accepts the MINLP model, and can automatically reformulate this into a MISOCP model:

x = cp.Variable(n,integer=True) obj = cp.Maximize(np.log(w)*x) con = w*x <= 10000 * cp.sqrt(cp.sum(x)) prob = cp.Problem(obj,[con, x >= 0, x <= 1]) prob.solve(solver=cp.ECOS_BB, max_iters=10000) print("status:",prob.status) print("Obj:",prob.value)

Unfortunately, the MISOCP solver that comes with CVXPY is not the most reliable one. It shows:

status: optimal_inaccurate Obj: 1057.1085712222387 X: [-1.52640229e-10 -4.44776117e-11 -6.08671797e-11 -9.71628737e-11 -9.95050649e-11 -1.23087763e-10 -8.63183870e-11 -4.39955343e-11 1.00000000e+00 -6.54372650e-11 -3.95051292e-11 -5.86051991e-11 ...

The reported objective is close to what we have found earlier.