Wednesday, April 4, 2018

SOCP reformulations of a min distance problem

In [1] I discussed a geometric problem: find a point (in 2d or 3d space) as close as possible to a number of given lines. In the comments it was suggested to write this as a Second Order Cone Programming (SOCP) model. Let's give that a try.

The lines \(L_j\) are defined by two points \(p_{j,a}\) and \(p_{j,b}\).

Minimize sum of distances to set of lines in 2d


The min sum of distances model was formulated as a general non-linear programming (NLP) model:\[\begin{align}\min&\sum_j d_j\\ &d^2_j = \frac{|p_{j,a}-q|^2 |p_{j,b}-p_{j,a}|^2 - \left[(p_{j,a}-q)\cdot (p_{j,b}-p_{j,a}) \right]^2}{|p_{j,b}-p_{j,a}|^2}\end{align}\] Somewhat simplified: \[\begin{align}\min&\sum_j d_j\\ &d^2_j = \frac{|\alpha_j-q|^2 |\beta_j|^2 - \left[(\alpha_j-q)\cdot \beta_j \right]^2}{|\beta_j|^2}\end{align}\] or as an unconstrained problem: \[\min_q\>\sum_j \sqrt{\frac{|\alpha_j-q|^2 |\beta_j|^2 - \left[(p_{j,a}-q)\cdot \beta_j \right]^2}{|\beta_j|^2}}\]

Minimizing the sum of distances in SOCP models can look like:\[\begin{align}\min&\sum_j x_j\\& x_j^2 \ge \sum_i y_{j,i}^2\\& x_j \ge 0\end{align}\]

We need to shoehorn our problem into this format. Let's give this a try. Any point \(\gamma_j\) on a line \(L_j\) can be written as:\[\begin{align}\gamma_j &= (1-t_j)p_{j,a} + t_j  p_{j,b}\\ &= p_{j,a} + t_j (p_{j,b}-p_{j,a})\\ & = \alpha_{j} + t _j\beta_j\end{align}\] The distance between \(q\) and \(\gamma_j\) is obviously \(||q-\gamma_j||_2\) or \(||\delta_j||_2\) where \( \delta_j = q-\gamma_j\).

We are getting close. Let's denote the coordinates of \(\delta_j\) as \(\delta_{j,c}\) where \(c=\{x,y,z\}\) (or \(c=\{x,y\}\) for a 2D problem). We can finally write: \[\begin{align} \min_{q,d,\delta,t}& \sum_j d_j\\ & \delta_j = (\alpha_j + t_j \beta_j) - q\\ &\sum_c \delta^2_{j,c} \le d_j^2\\ &d_j \ge 0\end{align}\] This does not look so bad. We have introduced a whole slew of extra variables, but our efforts will be rewarded. We can solve this model with solvers like Cplex, Gurobi, Mosek and Xpress without resorting to a general nonlinear solver.

Unfortunately when trying this with GAMS/Cplex, I got:

CPLEX Error  5002: 'QCP_row_for_dist(line1)' is not convex.  

This is a longstanding, known bug (many years, ough!) in the standard GAMS/Cplex link. There is an alternative, experimental link called cplexd which has some of these bugs fixed: use option qcp=cplexd; to make this work.

Ciritique (a.k.a. rant)


Formulating proper SOCP models can be somewhat of a pain. We are giving up a more natural formulation just to accommodate the solver. I.e., we are moving away from the problem to be closer to the solver. That looks to me like the wrong direction. As an extra complication: the precise rules for solvers to accept SOCP models are not uniform (some solvers are more strict than others). Solvers may do some easy reformulations. E.g. convex QP models may be reformulated automatically into SOCP models behind the scenes. In general, however, it is not so easy for a software system (either the modeling system or the solver) to recognize natural NLP formulations and automatically reformulate into acceptable SOCP models. Which unfortunately means that the task of devising (sometimes non-obvious) reformulations is off-loaded to the modeler. (Actually AMPL has facilities to do a lot of these transformations [4]).

The example above is not so illustrative, but [4] has a better one: \[\min \sqrt{(x+2)^2+(y+1)^2} + \sqrt{(x+y)^2} \>\Longleftrightarrow \begin{align}\min\>&u+v\\&r^2+s^2\le u^2\\&t^2\le v^2\\& x+2=r \\&y+1=s\\&x+y=t\\ & u,v\ge 0 \end{align} \]

It would be great if modeling systems could become smarter, and can reformulate standard mathematical constructs into conic formulations. One would need access to a symbolic representation of the model. I.e., when the solver sees the model, it is probably already too late to do anything but the most basic reformulations. With the advent of more exotic features such as the exponential cone this problem will only become more important.

Likely, I am somewhat in the minority here with my concerns. I believe many developers and academics are of the opinion that SOCP is the best thing since sliced bread (may be after semi-definite programming). I suspect that the obscurity of SOCP formulations is adding some mystique to the trade. However, when I develop a model, I want to have it as mystique-less as possible. Often the main goal of a model is stated as a bridge between the problem and the algorithm (or from human to machine). May be an even more important role is as communication tool between people. Clarity and simplicity is then paramount.

Another intriguing point is that we have to unlearn things we have gotten used to when solving LPs and NLPs. Here is an example. If we know some of the optimal values of a problem, we can fix some variables ahead of time. In general this is a good approach: it will make the model smaller and simpler to solve. Especially with smart presolvers: once a variable is fixed a lot of other variables may be removed from the model (something like a cascading effect). So an interesting experiment would be:


I.e.: solve the model, fix the \(d_i\) variables to their optimal values and solve again. We would normally expect the second model to be easier to solve. Well, not so much with conic solvers: they have problems with the second "easy" solve. E.g. one can see things like:


Interior-point solution summary
  Problem status  : ILL_POSED
  Solution status : PRIMAL_ILLPOSED_CER
  Dual.    obj: 2.0486145225e-08    nrm: 2e+02    Viol.  con: 0e+00    var: 1e-13    cones: 0e+00  

Return code - 0 [MSK_RES_OK]: No error occurred.

The message No error occurred is somewhat optimistic: the solver did not return a solution.

With a different solver:


Barrier method finished in 1 seconds
Problem is infeasible after     11 iterations
Uncrunching matrix
 
   Its         Obj Value      S   Ninf  Nneg        Sum Inf  Time
     0        366.241496      B    999     0       9.758622     1
Problem is infeasible
Presolved QP is infeasible
No solution will be returned
QP is infeasible

It is noted that this happens only for some data sets. For other data these solvers get things right.

Of course, we should be happy with these new solver capabilities. Some models solve extremely fast when stated as a SOCP, and we should take whatever performance and reliability gains we can get.

References

  1. Geometry lesson: find point closest to set of lines, http://yetanothermathprogrammingconsultant.blogspot.com/2018/04/geometry-lesson-find-point-closest-to.html
  2. Mosek Modeling Cookbook, https://docs.mosek.com/MOSEKModelingCookbook-letter.pdf. This has a good overview of SOCP formulation tricks.
  3. DCP, Disciplined Convex Programming, http://dcp.stanford.edu/. Modeling tools based on DCP can help to formulate proper SOCP models. 
  4. Victor Zverovich, Robert Fourer, Automatic Reformulation of Second-Order Cone Programming Problems, https://ampl.com/MEETINGS/TALKS/2015_01_Richmond_2E.2.pdf

Appendix: An example of a picky solver


The reason Cplex complains, is that GAMS generates the following ugly model:

\ENCODING=ISO-8859-1
\Problem name: gamsmodel

Minimize
 _obj: d(line1) + d(line2) + d(line3) + z
Subject To
 _eline(line1.x)#0: 0.757256448 t(line1) + q(x) + delta(line1.x)
                     = -0.656505736
 _eline(line1.y)#1: - 1.084257608 t(line1) + q(y) + delta(line1.y)
                     = 0.686533416
 _eline(line2.x)#2: 0.115236774 t(line2) + q(x) + delta(line2.x)
                     = -0.415575766
 _eline(line2.y)#3: 1.26443496 t(line2) + q(y) + delta(line2.y)  = -0.551894266
 _eline(line3.x)#4: 1.862007808 t(line3) + q(x) + delta(line3.x)
                     = -0.865772554
 _eline(line3.y)#5: 0.157045418 t(line3) + q(y) + delta(line3.y)
                     = 0.000421337999999993
 dist(line1)#6:     - linear_part_of_dist(line1)  = 0
 dist(line2)#7:     - linear_part_of_dist(line2)  = 0
 dist(line3)#8:     - linear_part_of_dist(line3)  = 0
 obj#9:             z  = 0
 QCP_row_for_dist(line1): linear_part_of_dist(line1) + [ delta(line1.x) ^2
                          + delta(line1.y) ^2 - d(line1) ^2 ] <= 0
 QCP_row_for_dist(line2): linear_part_of_dist(line2) + [ delta(line2.x) ^2
                          + delta(line2.y) ^2 - d(line2) ^2 ] <= 0
 QCP_row_for_dist(line3): linear_part_of_dist(line3) + [ delta(line3.x) ^2
                          + delta(line3.y) ^2 - d(line3) ^2 ] <= 0
Bounds
      t(line1) Free
      t(line2) Free
      t(line3) Free
      q(x) Free
      q(y) Free
      delta(line1.x) Free
      delta(line1.y) Free
      delta(line2.x) Free
      delta(line2.y) Free
      delta(line3.x) Free
      delta(line3.y) Free
      z Free
      linear_part_of_dist(line1) Free
      linear_part_of_dist(line2) Free
      linear_part_of_dist(line3) Free
End

This model leads to CPLEX Error  5002: 'QCP_row_for_dist(line1)' is not convex. If we clean this up a bit by hand, we get:

\ENCODING=ISO-8859-1
\Problem name: gamsmodel

Minimize
 _obj: d(line1) + d(line2) + d(line3)
Subject To
 _eline(line1.x)#0: 0.757256448 t(line1) + q(x) + delta(line1.x)
                     = -0.656505736
 _eline(line1.y)#1: - 1.084257608 t(line1) + q(y) + delta(line1.y)
                     = 0.686533416
 _eline(line2.x)#2: 0.115236774 t(line2) + q(x) + delta(line2.x)
                     = -0.415575766
 _eline(line2.y)#3: 1.26443496 t(line2) + q(y) + delta(line2.y)  = -0.551894266
 _eline(line3.x)#4: 1.862007808 t(line3) + q(x) + delta(line3.x)
                     = -0.865772554
 _eline(line3.y)#5: 0.157045418 t(line3) + q(y) + delta(line3.y)
                     = 0.000421337999999993
 QCP_row_for_dist(line1): [ delta(line1.x) ^2
                          + delta(line1.y) ^2 - d(line1) ^2 ] <= 0
 QCP_row_for_dist(line2): [ delta(line2.x) ^2
                          + delta(line2.y) ^2 - d(line2) ^2 ] <= 0
 QCP_row_for_dist(line3): [ delta(line3.x) ^2
                          + delta(line3.y) ^2 - d(line3) ^2 ] <= 0
Bounds
      t(line1) Free
      t(line2) Free
      t(line3) Free
      q(x) Free
      q(y) Free
      delta(line1.x) Free
      delta(line1.y) Free
      delta(line2.x) Free
      delta(line2.y) Free
      delta(line3.x) Free
      delta(line3.y) Free
End

Cplex is happy with this model. We see: Barrier - Optimal:  Objective =  1.2652779019e-01.