Yet Another Math Programming Consultant: QuadProg: usage issues

The QuadProg QP (Quadratic Programming) solver, based on an algorithm by Goldfarb and Idnani [1,2], is quite popular. It is available for R [3] and Python [4]. However, it has some real drawbacks. Some of it is just "user interface" and documentation: how to call the solver.

The mathematical model the solver works with is [3]: \[\begin{align} \min\>& -\color{darkblue}d^T\color{darkred}b + \frac{1}{2}\color{darkred}b^T\color{darkblue}D\color{darkred}b \\ & \color{darkblue}A^T\color{darkred}b \begin{bmatrix}= \\ \ge \end{bmatrix} \color{darkblue}b_0 \end{align} \]

This immediately poses some questions:

Variables are typically called \(\color{darkred}x\), and \(\color{darkblue}b\) is usually a constant. Why the different notation here?
I don't know. Maybe to make sure you pay attention.
How to handle bounds?
The variable type the solver supports is called "free" (i.e. unrestricted). This is different than for many LP solvers, where the default variable type is positive. For non-negative variables, you need to explicitly add the constraints \(\color{darkred}b \ge 0\). Similar for other bounds.
Why the different signs for the linear part and the quadratic part in the objective?
I don't know.
Why the constant \(1/2\) in the objective?
We see this more often in QP solvers. This is because the gradient of the quadratic term \(\color{darkred}x^T\color{darkblue}Q\color{darkred}x\) is \(2\color{darkblue}Q\color{darkred}x\). The algorithm prefers to work with \(\color{darkblue}Q\color{darkred}x\) however. So in order to stay really close to the algorithm, the user is asked to multiply all quadratic coefficients \(\color{darkblue}D_{i,j}\) by 2.
Most QP solvers call the matrix with quadratic coefficients \(\color{darkblue}Q\). Why is it called \(\color{darkblue}D\) here?
Don't know.
Does \(\color{darkblue}D\) have to be symmetric?
The documentation does not tell, but the answer is: only the diagonal and upper-triangular part of \(\color{darkblue}D\) are used. This means the matrix is assumed to be symmetric. E.g. when passing \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 100 & 1 \end{pmatrix}\] the solver thinks it received \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\]
Can \(\color{darkblue}D\) be positive semi-definite?
In QuadProg, \(\color{darkblue}D\) has to be strictly positive definite.
Defnition: A matrix \(Q\) is called positive semi-definite when \(x^TQx\ge 0\) for all \(x\). It is called positive definite when \(x^TQx\gt 0\) for all \(x \ne 0\). A positive definite matrix has strictly positive eigenvalues. I.e., the smallest eigenvalue of \(\color{darkblue}D\) must be \(\lambda_{min}\gt 0\). For a positive semi-definite matrix all eigenvalues are non-negative (i.e., zero is allowed).
The fact that QuadProg requires a strict positive definite matrix, means that all variables need to be quadratic and also that this algorithm can not solve LPs. Most QP solvers (in fact all the ones I know beyond QuadProg) only require positive semi-definite quadratic coefficient matrices. I.e., for linear variables we can use coefficients \(\color{darkblue}q_{i,j}=0\) (where \(i\) or \(j\) corresponds to a linear variable). Quadprog is stricter. The whole matrix \(\color{darkblue}D\) most be strictly positive definite. You should be aware of these restrictions. And the documentation [3] should have mentioned this.
Can I trick QuadProg to accept linear variables?
Yes, suppose we have 2 quadratic variables and 2 linear variables. Then form the \(\color{darkblue}D\) matrix as \[\color{darkblue}D = \begin{pmatrix} \color{darkblue}d_{1,1} &\color{darkblue}d_{1,2} & 0 & 0 \\ \color{darkblue}d_{2,1} &\color{darkblue}d_{2,2} & 0 & 0 \\ 0 & 0 &\color{darkblue} \varepsilon & 0 \\ 0 & 0 & 0 &\color{darkblue} \varepsilon \end{pmatrix}\] where \(\color{darkblue} \varepsilon\gt 0\) is a small constant. This trick would also allow us to feed the algorithm linear problems.
Usually, solvers express linear constraints as \(\color{darkblue}A\color{darkred}x=\color{darkblue}b\). Why do we have a transposed matrix \(\color{darkblue}A^T\) here?
I suspect this is closer to what the dual algorithm prefers. Again, making the input closer to what the solver wants.
How to input \(\le\) constraints?
Multiply these by \(-1\) and make them \(\ge\) constraints.

These are some of the questions I have seen in the past about QuadProg. Hopefully, this at least demystifies some of them.

As an alternative to QuadProg, you may want to have a look at CVXR or CVXPY for modeling QPs. These tools can talk to different solvers and will apply the necessary transformations to make the model suited for the selected solver. The modeling is also a bit more natural than the matrix input for QuadProg. For non-trivial quadratic programming models, QuadProg may not be the best choice.

References

D. Goldfarb and A. Idnani (1982). Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, pages 226–239.
D. Goldfarb and A. Idnani (1983). A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1–33.
R package quadprog, https://cran.r-project.org/web/packages/quadprog/quadprog.pdf
Python package quadprog, https://pypi.org/project/quadprog/

Yet Another Math Programming Consultant

Monday, November 30, 2020

QuadProg: usage issues

References

No comments:

Post a Comment