Yet Another Math Programming Consultant: Quadratic Programming with Binary Variables

Quadratic Programming models where the quadratic terms involve only binary variables are interesting from a modeling point view: we can apply different reformulations.

Let's have a look at the basic model:

0-1 Unconstrained Non-convex Quadratic Programming
\[\begin{align}\min\>& \color{DarkRed}x^{T} \color{DarkBlue}Q \color{DarkRed}x + \color{DarkBlue} c^{T}\color{DarkRed}x\\ & \color{DarkRed}x_i \in \{0,1\}\end{align} \]

Only if the matrix $Q$ is positive definite we have a convex problem. So, in general, the above problem is non-convex. To keep things simple, I have no constraints and no additional continuous variables (adding those does not not really change the story).

Test data

To play a bit a with this model, I generated random data:

Q is about 25% dense (i.e. about 75% of the entries $q_{i,j}$ are zero). The nonzero entries are drawn from a uniform distribution between -100 and 100.
The linear coefficients are uniformly distributed $c_i \sim U(-100,100)$.
The size of the model is: $n=75$ (i.e. 75 binary variables). This is relative small, so the hope is we can solve this problem quickly. As we shall see the results will be very mixed.

Local MINLP solvers

Many local MINLP solvers tolerate non-convex problems, but they will not produce a global optimum. So we see:

Solver	Objective	Time	Notes
SBB	-7558.6235	0.5	Local optimum
Knitro	-7714.5721	0.4	Id.
Bonmin	-7626.7975	1.3	Id.

All solvers used default settings and timings are in seconds. It is not surprising that these local solvers find different local optima. For all solvers, the relaxed solution was almost integer and just a few nodes were needed to produce an integer solution. This looks promising. Unfortunately, we need to contain our optimism.

Global MINLP Solvers

Global MINLP solvers are in theory well-equipped to solve this model. Unfortunately, they are usually quite slow. For this example, we see a very wide performance range:

Solver	Objective	Time	Notes
Baron	-7760.1771	82
Couenne	-7646.5987	>3600	Time limit, gap 25%
Antigone	-7760.1771	252

Couenne is struggling with this model. Baron and Antigone are doing quite good on this model. We can further observe that the local solvers did not find the global optimal solution.

MIQP Solvers

If we just use an MIQP solver, we may get different results, depending on the solvers. If the solver expects a convex model, it will refuse to solve the model. Other solvers may use some automatic reformulation. Let's try a few:

Solver	Objective	Time	Notes
Mosek			Q not positive definite
Cplex	-7760.1771	27	Automatically reformulated to a MIP
Gurobi	-7760.1760	>9999	Time limit, gap 37% (Gurobi 8.0)

Most solvers have options to influence what reformulations are applied. Here we ran with default settings. MIQP solvers tend to have many options, including those that influence automatic reformulations. I just used defaults, assuming "the solver knows best what to do".

The global MINLP solvers Baron and Antigone did not do bad at all. It is noted that Gurobi 8.1 has better MIQP performance [2] (hopefully it does much better than what we see here). It is noted that we can force Gurobi to linearize the MIQP model using the solver option preqlinearize 1, and in that case it solves fast.

Perturb Diagonal

For borderline non-convex models, it is not unusual to see messages from a quadratic solver that the diagonal of $Q$ has been perturbed to make the problem convex. Here we do the same thing in the extreme [1].

Background: a matrix $Q$ is positive definite (positive semi-definite) if all eigenvalues $\lambda_i \gt 0$ ($\lambda_i\ge 0$). If there are negative eigenvalues, we can conclude $\min x^TQx$ is a non-convex problem. From this we see that the sign of the smallest eigenvalue $\lambda_{min}$ plays an important role.

To calculate the smallest eigenvalue we first have to make $Q$ symmetric (otherwise we would get complex eigenvalues). This can easily be done by replacing $Q$ by $0.5(Q^T+Q)$. This operation will not change the values of the quadratic form $x^TQx$.

If after calculating the smallest eigenvalue $\lambda_{min}$, we observe $\lambda_{min} \lt 0$, we can form \[\widetilde{Q} = Q - \lambda_{min} I \] Note that we actually add a positive number to the diagonal as $\lambda_{min}\lt 0$. To compensate we need to add to the objective a linear term of the form \[\sum_i \lambda_{min} x_i^2 = \sum_i \lambda_{min} x_i\] (for binary variables we have $x_i^2=x_i$). With this trick, we made the problem convex.

For our data set we have $\lambda_{min} = -353.710$. To make sure we are becoming convex, I added a very generous tolerance: $\lambda_{min}-1$. So I used: $\widetilde{Q} = Q - (\lambda_{min}-1) I $.

Convexified Model
\[\begin{align}\min\>& \color{DarkRed} x^T \left( \color{DarkBlue} Q - (\lambda_{min}-1) I \right) \color{DarkRed} x + \left(\color{DarkBlue} c + (\lambda_{min}-1) \right)^T \color{DarkRed} x \\ & \color{DarkRed}x_i \in \{0,1\}\end{align} \]

With this reformulation we obtained a convex MIQP. This means for instance that a solver like Mosek is back in play, and that local solvers will produce global optimal solutions. Let's try:

Solver	Objective	Time	Notes
Mosek	-7760.1771	725
Knitro	-7760.1771	2724	Node limit, gap: 3%
Bonmin	-7760.1771	>3600	Time limit, gap: 6%

These results are a little bit slower than I expected, especially when comparing to the performance of the global solvers Baron and Antigone. These results are also much slower than the first experiment with local solvers where we found integer feasible local solutions very fast.

Note. We could have started by removing all diagonal elements from $Q$ and moving them into $c$. This is again based on the fact that $x_i^2 = x_i$. I did not do this step in this experiment.

Linearization

We already saw that some solvers (such as Cplex) apply a linearization automatically. Of course we can do this ourselves.

The first thing we can do to help things along is to make $Q$ a triangular matrix. We can do this by: \[\tilde{q}_{i,j} = \begin{cases} q_{i,j}+q_{j,i} & \text{if $i \lt j$} \\ q_{i,j} & \text{if $i=j$}\\ 0 & \text{if $i \gt j$}\end{cases}\]

The next thing to do is to introduce variables $y_{i,j} = x_i x_j$. This binary multiplication can be linearized easily: \[\begin{align} & y_{i,j} \le x_i \\ & y_{i,j} \le x_j \\ & y_{i,j} \ge x_i + x_j -1 \\ & 0 \le y_{i,j} \le 1 \end{align}\] In the actual model, we can skip a few of these inequalities by observing in which directions the objective pushes variables $y_{i,j}$ (see [1]).

Linearized Model
\[\begin{align} \min\>& \sum_{i,j\|i\lt j} \color{DarkBlue}{\tilde{q}}_{i,j} \color{DarkRed} y_{i,j} + \sum_i \left( \color{DarkBlue} {\tilde{q}}_{i,i} + \color{DarkBlue} c_i \right) \color{DarkRed} x_i \\ & \color{DarkRed}y_{i,j} \le \color{DarkRed}x_i && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \lt 0 \\ & \color{DarkRed}y_{i,j} \le \color{DarkRed}x_j && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \lt 0 \\ & \color{DarkRed}y_{i,j} \ge \color{DarkRed}x_i +\color{DarkRed}x_j -1 && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \gt 0 \\ & 0 \le \color{DarkRed}y_{i,j} \le 1 && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \ne 0 \\ & \color{DarkRed}x_i \in \{0,1\} \\ \end{align} \]

Linearized Model

\[\begin{align}
\min\>& \sum_{i,j|i\lt j} \color{DarkBlue}{\tilde{q}}_{i,j} \color{DarkRed} y_{i,j} + \sum_i \left( \color{DarkBlue} {\tilde{q}}_{i,i} + \color{DarkBlue} c_i \right) \color{DarkRed} x_i \\
& \color{DarkRed}y_{i,j} \le \color{DarkRed}x_i && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \lt 0 \\
& \color{DarkRed}y_{i,j} \le \color{DarkRed}x_j && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \lt 0 \\
& \color{DarkRed}y_{i,j} \ge \color{DarkRed}x_i +\color{DarkRed}x_j -1 && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \gt 0 \\

& 0 \le \color{DarkRed}y_{i,j} \le 1 && \forall i\lt j, \color{DarkBlue} {\tilde{q}}_{i,j} \ne 0 \\
& \color{DarkRed}x_i \in \{0,1\} \\ \end{align} \]

This model does not care whether the original problem is convex or not. Let's see how this works:

Solver	Objective	Time	Notes
Cplex	-7760.1771	41
CBC	-7760.1771	6488

It is known this MIP is not so easy to solve. A commercial MIP solver may be required to get good solution times. Here we see that Cplex (commercial) is doing much better than CBC (open source).

Conclusion

The problem under consideration: an unconstrained MIQP with just $n=75$ binary variables, is not that easy to solve. The overall winning strategy is to use a commercial MIP solver against a manually or automatically reformulated MIP model. Solving the MIQP directly is just very difficult for many solvers. The global solver Baron does a surprisingly good job. It is noted that if the data or the problem size changes, these performance figures may shift (a lot).

Update

An earlier version of this post had a much slower performance for Cplex MIQP. When rerunning this, I could not reproduce this, so this must have been a note taking error on my side (I suspect I was comparing with a result with $n=100$). Now, Cplex MIQP and Cplex MIP on the manually reformulated model perform comparable. My faith in Cplex automatic reformulation is fully restored (and my faith in my note taking skills further reduced). Apologies for this.

References

Billionnet, A. and Elloumi, S., Using a mixed integer quadratic programming solver for the unconstrained quadratic 0-1 problem. Math. Program. 109 (2007) pp. 55–68
http://yetanothermathprogrammingconsultant.blogspot.com/2018/10/gurobi-81.html

Yet Another Math Programming Consultant

Saturday, November 10, 2018

Quadratic Programming with Binary Variables