Yet Another Math Programming Consultant: Nonlinear variant of a knapsack problem

In [1] a problem is posed:

Original Problem
\[\begin{align}\max & \sum_i \log_{100}(\color{darkblue}w_i) \cdot \color{darkred}x_i \\ & \frac{\sum_i \color{darkblue}w_i \color{darkred}x_i}{\sqrt{\color{darkred}k}} \le 10,000\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]

The data vector \(w_i\) is assumed to integer valued with \(w_i\ge 1\). Hence the logarithm can be evaluated without a problem. Also we can assume that the number of items can be around 1,000.

I don't think I have ever seen \(\log_{100}()\) being used in a model. Most programming (and modeling) languages only support natural logarithms \(\ln()\) and may be \(\log_{10}()\). We can convert things by: \[\log_{100}(x) = \frac{\ln(x)}{\ln(100)}\] This means the objective can be written as \[\max \frac{1}{\ln(100)} \sum_i \ln(w_i) x_i\] Essentially, the \(\log_{100}()\) function just adds a scaling factor. We can simplify the objective to \[\max \sum_i \ln(w_i)x_i\] (The objective value will be different, but the optimal solution will be the same). Note that this is a linear objective.

log100(x) vs ln(x)

In this post I'll discuss a few different formulations:

As as MINLP (Mixed Integer Nonlinear Programming) model. This is an obvious choice considering the square root in the model.
An MIQCP (Mixed Integer Quadratically Constrained Programming) model. This (convex) quadratic reformulation allows us to use alternative solvers.
A linear MIP model. This formulation may not be immediately obvious. Of course with this model we can use widely available MIP solvers.

MINLP Formulation

The constraint can be rewritten as: \[\sum_i w_i x_i \le 10,000 \sqrt{k}\] If we ignore the all zero solution, we can assume \(k\ge 1\). This bound will make sure the square root function is always differentiable. With this, we have a standard MINLP model.

MINLP Model
\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \sum_i \color{darkblue}w_i \color{darkred}x_i \le 10,000\sqrt{\color{darkred}k}\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred} k \ge 1 \end{align}\]

This model has only a single, well behaved, non-linearity. Literally, there is only one nonlinear nonzero element. In addition, the model is convex. So we don't expect any problems. For \(w_i\), I generated 1000 random integer values from the interval \([1,10000]\). All timings are in seconds.

MINLP Results
Solver	Obj	Time	Notes
Dicopt	1057.1355	0.6	3 NLP, 2 MIP subproblems
SBB	1056.9361	95	Node limit exceeded
Bonmin	1056.9472	600	Time limit exceeded
Bonmin	1057.1355	16	Option: bonmin.algorithm B-OA

The outer-approximation based algorithms (Dicopt, Bonmin with B-OA option) do much better than the branch & bound algorithms (SBB, default Bonmin). Even the global solvers do better:

Global Solver Results
Solver	Obj	Time
Baron	1057.1355	2
Antigone	1057.1355	1
Couenne	1057.1355	9

MIQCP Formulation

The problem can also be formulated as a convex MIQCP model:

MIQCP Model
\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \color{darkred}y^2 \le \color{darkred}k\\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \color{darkred}y = \frac{\sum_i \color{darkblue}w_i \color{darkred}x_i}{10,000} \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred} k \ge 1, \color{darkred}y \ge 0 \end{align}\]

Cplex has two ways to form relaxations:

MIQCPstrat 1: SOCP-based branch & bound
MIQCPstrat 2: OA-based branch & cut

The results are shown in the next section. We see again that the Outer Approximation approach works better.

MIP Formulation

Finally, we can also linearize this model by observing that \(k\in \{1,\dots,1000\}\). So we don't really have a continuous function \(f(k)=\sqrt{k}\), but rather only need function values at the integer points. We can exploit this by making this explicit. We can write: \[\begin{align} & k = \sum_i i\cdot \delta_i \\ & \sqrt{k} = \sum_i \sqrt{i}\cdot \delta_i \\ & \sum_i \delta_i = 1 \\ & \delta_i \in \{0,1\}\end{align}\] This is essentially a SOS1 (Special Ordered Set of Type 1) structure implementing a table lookup. The MIP model looks like:

MIP Model
\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \sum_i \color{darkblue}w_i \color{darkred} x_i \le 10,000 \color{darkred}q \\ & \color{darkred} k = \sum_i i \cdot \color{darkred}\delta_i \\ & \color{darkred} q = \sum_i \sqrt{i} \cdot \color{darkred}\delta_i \\ & \sum_i \color{darkred} \delta_i = 1 \\ & \color{darkred}x_i, \color{darkred}\delta_i \in \{0,1\} \\ & \color{darkred} k, \color{darkred} q \ge 1 \end{align}\]

MIP Model

\[\begin{align}\max & \sum_i \ln(\color{darkblue}w_i) \color{darkred}x_i \\ & \color{darkred} k = \sum_i \color{darkred}x_i \\ & \sum_i \color{darkblue}w_i \color{darkred} x_i \le 10,000 \color{darkred}q \\ & \color{darkred} k = \sum_i i \cdot \color{darkred}\delta_i \\ & \color{darkred} q = \sum_i \sqrt{i} \cdot \color{darkred}\delta_i \\ & \sum_i \color{darkred} \delta_i = 1 \\ & \color{darkred}x_i, \color{darkred}\delta_i \in \{0,1\} \\ & \color{darkred} k, \color{darkred} q \ge 1 \end{align}\]

When we solve this problem we see:

MIQCP and MIP Results
Model	Solver	Obj	Time	Notes
MIQCP	Cplex	1057.1355	600	SOCP-based branch & bound, time limit
MIQCP	Cplex	1057.1355	0.3	OA-based branch & cut
MIP	Cplex	1057.1355	0.6

Conclusion: the question in the original post was: how to solve this problem? Here we proposed three different models: an MINLP, MIQCP and MIP model. All these models can be solved quickly. It is noted that the quadratic and linear models are not approximations: they give the same solution as the original MINLP model. Pure non-linear branch & bound methods are having a bit of a problem with the MINLP model, but Outer-Approximation works very well.

References

How do we solve a variant of the knapsack problem in which the capacity of the knapsack keeps increasing as we add more items into the knapsack?, https://stackoverflow.com/questions/59242370/how-do-we-solve-a-variant-of-the-knapsack-problem-in-which-the-capacity-of-the-k
Pierre Bonami, Andrea Tramontani, Recent improvement to MISOCP in CPLEX, Informs 2015, http://www.optimizationdirect.com/data/informs2015_bonami.pdf

Appendix: automatic reformulation with CVXPY

CVXPY accepts the MINLP model, and can automatically reformulate this into a MISOCP model:

x = cp.Variable(n,integer=True)
obj = cp.Maximize(np.log(w)*x)
con = w*x <= 10000 * cp.sqrt(cp.sum(x)) 
prob = cp.Problem(obj,[con, x >= 0, x <= 1])
prob.solve(solver=cp.ECOS_BB, max_iters=10000)
print("status:",prob.status)
print("Obj:",prob.value)

Unfortunately, the MISOCP solver that comes with CVXPY is not the most reliable one. It shows:

status: optimal_inaccurate
Obj: 1057.1085712222387
X: [-1.52640229e-10 -4.44776117e-11 -6.08671797e-11 -9.71628737e-11
 -9.95050649e-11 -1.23087763e-10 -8.63183870e-11 -4.39955343e-11
  1.00000000e+00 -6.54372650e-11 -3.95051292e-11 -5.86051991e-11
...

The reported objective is close to what we have found earlier.

Yet Another Math Programming Consultant

Tuesday, December 10, 2019

Nonlinear variant of a knapsack problem

MINLP Formulation

MIQCP Formulation

MIP Formulation

References

Appendix: automatic reformulation with CVXPY

No comments:

Post a Comment