In [1] I discussed a geometric problem: find a point (in 2d or 3d space) as close as possible to a number of given lines. In the comments it was suggested to write this as a Second Order Cone Programming (SOCP) model. Let's give that a try.
The lines \(L_j\) are defined by two points \(p_{j,a}\) and \(p_{j,b}\).
[Figure: minimize the sum of distances to a set of lines in 2d]

The
min sum of distances model was formulated as a general non-linear programming (NLP) model:\[\begin{align}\min&\sum_j d_j\\ &d^2_j = \frac{|p_{j,a}-q|^2 |p_{j,b}-p_{j,a}|^2 - \left[(p_{j,a}-q)\cdot (p_{j,b}-p_{j,a}) \right]^2}{|p_{j,b}-p_{j,a}|^2}\end{align}\] Somewhat simplified, with \(\alpha_j = p_{j,a}\) and \(\beta_j = p_{j,b}-p_{j,a}\): \[\begin{align}\min&\sum_j d_j\\ &d^2_j = \frac{|\alpha_j-q|^2 |\beta_j|^2 - \left[(\alpha_j-q)\cdot \beta_j \right]^2}{|\beta_j|^2}\end{align}\] or as an unconstrained problem: \[\min_q\>\sum_j \sqrt{\frac{|\alpha_j-q|^2 |\beta_j|^2 - \left[(\alpha_j-q)\cdot \beta_j \right]^2}{|\beta_j|^2}}\]
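As a sanity check, the closed-form distance can be evaluated directly. Here is a minimal pure-Python sketch (the line data in the example is made up for illustration):

```python
import math

def dist_point_to_line(q, a, b):
    """Distance from point q to the (infinite) line through points a and b,
    using d^2 = (|alpha-q|^2 |beta|^2 - ((alpha-q).beta)^2) / |beta|^2
    with alpha = a and beta = b - a."""
    alpha = a
    beta = [bi - ai for ai, bi in zip(a, b)]
    amq = [ai - qi for ai, qi in zip(alpha, q)]          # alpha - q
    amq2 = sum(v * v for v in amq)                       # |alpha - q|^2
    beta2 = sum(v * v for v in beta)                     # |beta|^2
    dot = sum(u * v for u, v in zip(amq, beta))          # (alpha - q) . beta
    return math.sqrt((amq2 * beta2 - dot * dot) / beta2)

# line through (0,0) and (1,0) is the x-axis; the point (0,1) is at distance 1
print(dist_point_to_line([0.0, 1.0], [0.0, 0.0], [1.0, 0.0]))  # -> 1.0
```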
In SOCP form, minimizing a sum of distances can look like:\[\begin{align}\min&\sum_j x_j\\& x_j^2 \ge \sum_i y_{j,i}^2\\& x_j \ge 0\end{align}\]
We need to shoehorn our problem into this format. Let's give this a try. Any point \(\gamma_j\) on a line \(L_j\) can be written as:\[\begin{align}\gamma_j &= (1-t_j)p_{j,a} + t_j p_{j,b}\\ &= p_{j,a} + t_j (p_{j,b}-p_{j,a})\\ & = \alpha_j + t_j \beta_j\end{align}\] The distance between \(q\) and \(\gamma_j\) is obviously \(||q-\gamma_j||_2\) or \(||\delta_j||_2\) where \( \delta_j = q-\gamma_j\).
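For a fixed \(q\), the distance \(||q - (\alpha_j + t_j\beta_j)||_2\) is minimized at \(t_j^* = \beta_j\cdot(q-\alpha_j)/|\beta_j|^2\) (set the derivative with respect to \(t_j\) to zero). A small numerical check, with made-up data:

```python
import math

def closest_t(q, alpha, beta):
    """Minimizer of ||q - (alpha + t*beta)|| over t: t* = beta.(q-alpha)/|beta|^2."""
    qma = [qi - ai for qi, ai in zip(q, alpha)]
    return sum(b * v for b, v in zip(beta, qma)) / sum(b * b for b in beta)

alpha, beta = [0.0, 0.0], [2.0, 0.0]   # line through (0,0) and (2,0): the x-axis
q = [1.0, 3.0]
t = closest_t(q, alpha, beta)
gamma = [a + t * b for a, b in zip(alpha, beta)]   # closest point on the line
d = math.dist(q, gamma)                            # distance to the line
print(t, gamma, d)   # t = 0.5, closest point (1,0), distance 3
```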
We are getting close. Let's denote the coordinates of \(\delta_j\) as \(\delta_{j,c}\) where \(c=\{x,y,z\}\) (or \(c=\{x,y\}\) for a 2D problem). We can finally write: \[\begin{align} \min_{q,d,\delta,t}& \sum_j d_j\\ & \delta_j = q - (\alpha_j + t_j \beta_j)\\ &\sum_c \delta^2_{j,c} \le d_j^2\\ &d_j \ge 0\end{align}\] This does not look so bad. We have introduced a whole slew of extra variables, but our efforts will be rewarded: we can solve this model with solvers like Cplex, Gurobi, Mosek and Xpress without resorting to a general nonlinear solver.
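The extra variables do not change the optimum: for any fixed \(q\), picking the optimal \(t_j\) and setting \(d_j = ||\delta_j||_2\) satisfies all constraints, makes the cone constraint tight, and reproduces the point-to-line distance. A quick pure-Python check of this (the line data is made up):

```python
import math

def socp_pieces(q, a, b):
    """For fixed q, build (t, delta, d) that satisfy the SOCP constraints
    with the cone constraint tight: delta = q - (alpha + t*beta), d = ||delta||."""
    alpha = a
    beta = [bi - ai for ai, bi in zip(a, b)]
    beta2 = sum(v * v for v in beta)
    t = sum(bv * (qv - av) for bv, qv, av in zip(beta, q, alpha)) / beta2
    delta = [qv - (av + t * bv) for qv, av, bv in zip(q, alpha, beta)]
    d = math.sqrt(sum(v * v for v in delta))
    return t, delta, d

q = [0.5, 0.5]
t, delta, d = socp_pieces(q, [0.0, 0.0], [2.0, 0.0])   # line: the x-axis
# cone constraint sum(delta^2) <= d^2 holds with equality at the optimum in t
print(sum(v * v for v in delta), d * d)
```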
Unfortunately when trying this with GAMS/Cplex, I got:
CPLEX Error 5002: 'QCP_row_for_dist(line1)' is not convex.
This is a longstanding, known bug (many years, ouch!) in the standard GAMS/Cplex link. There is an alternative, experimental link called cplexd that has some of these bugs fixed: use option qcp=cplexd; to make this work.
Critique (a.k.a. rant)
Formulating proper SOCP models can be somewhat of a pain. We are giving up a more natural formulation just to accommodate the solver. I.e., we are moving away from the problem to be closer to the solver. That looks to me like the wrong direction. As an extra complication: the precise rules for solvers to accept SOCP models are not uniform (some solvers are more strict than others). Solvers may do some easy reformulations. E.g. convex QP models may be reformulated automatically into SOCP models behind the scenes. In general, however, it is not so easy for a software system (either the modeling system or the solver) to recognize natural NLP formulations and automatically reformulate into acceptable SOCP models. Which unfortunately means that the task of devising (sometimes non-obvious) reformulations is off-loaded to the modeler. (Actually AMPL has facilities to do a lot of these transformations [4]).
The example above is not so illustrative, but [4] has a better one: \[\min \sqrt{(x+2)^2+(y+1)^2} + \sqrt{(x+y)^2} \>\Longleftrightarrow \begin{align}\min\>&u+v\\&r^2+s^2\le u^2\\&t^2\le v^2\\& x+2=r \\&y+1=s\\&x+y=t\\ & u,v\ge 0 \end{align} \]
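The equivalence can be checked pointwise: for any \((x,y)\), setting \(r=x+2\), \(s=y+1\), \(t=x+y\), \(u=\sqrt{r^2+s^2}\), \(v=|t|\) satisfies all the constraints with the cones tight, and \(u+v\) equals the original objective. A small numerical check:

```python
import math

def original(x, y):
    return math.sqrt((x + 2)**2 + (y + 1)**2) + math.sqrt((x + y)**2)

def reformulated(x, y):
    # auxiliary variables chosen so the cone constraints hold with equality
    r, s, t = x + 2, y + 1, x + y
    u = math.sqrt(r * r + s * s)   # tight: r^2 + s^2 <= u^2, u >= 0
    v = abs(t)                     # tight: t^2 <= v^2, v >= 0
    return u + v

for (x, y) in [(0.0, 0.0), (-0.5, 0.5), (3.0, -4.0)]:
    assert math.isclose(original(x, y), reformulated(x, y))
```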
It would be great if modeling systems became smarter and could reformulate standard mathematical constructs into conic formulations. One would need access to a symbolic representation of the model; i.e., by the time the solver sees the model, it is probably already too late to do anything but the most basic reformulations. With the advent of more exotic features such as the exponential cone, this problem will only become more important.
Likely, I am somewhat in the minority here with my concerns. I believe many developers and academics are of the opinion that SOCP is the best thing since sliced bread (maybe after semi-definite programming). I suspect that the obscurity of SOCP formulations adds some mystique to the trade. However, when I develop a model, I want it to be as mystique-less as possible. Often the main goal of a model is stated as a bridge between the problem and the algorithm (or from human to machine). Maybe an even more important role is as a communication tool between people. Clarity and simplicity are then paramount.
Another intriguing point is that we have to unlearn things we have gotten used to when solving LPs and NLPs. Here is an example. If we know some of the optimal values of a problem, we can fix some variables ahead of time. In general this is a good approach: it will make the model smaller and simpler to solve. Especially with smart presolvers: once a variable is fixed, a lot of other variables may be removed from the model (something like a cascading effect). So an interesting experiment would be the following: solve the model, fix the \(d_j\) variables to their optimal values, and solve again. We would normally expect the second model to be easier to solve. Well, not so much with conic solvers: they have problems with the second "easy" solve. E.g. one can see things like:
Interior-point solution summary
Problem status : ILL_POSED
Solution status : PRIMAL_ILLPOSED_CER
Dual. obj: 2.0486145225e-08 nrm: 2e+02 Viol. con: 0e+00 var: 1e-13 cones: 0e+00
Return code - 0 [MSK_RES_OK]: No error occurred.
The message "No error occurred" is somewhat optimistic: the solver did not return a solution.
With a different solver:
Barrier method finished in 1 seconds
Problem is infeasible after 11 iterations
Uncrunching matrix
Its Obj Value S Ninf Nneg Sum Inf Time
0 366.241496 B 999 0 9.758622 1
Problem is infeasible
Presolved QP is infeasible
No solution will be returned
QP is infeasible
Note that this happens only for some data sets; for other data, these solvers get things right.
Of course, we should be happy with these new solver capabilities. Some models solve extremely fast when stated as a SOCP, and we should take whatever performance and reliability gains we can get.
References
Appendix: An example of a picky solver
The reason Cplex complains is that GAMS generates the following ugly model:
\ENCODING=ISO-8859-1
\Problem name: gamsmodel
Minimize
_obj: d(line1) + d(line2) + d(line3) + z
Subject To
_eline(line1.x)#0: 0.757256448 t(line1) + q(x) + delta(line1.x)
= -0.656505736
_eline(line1.y)#1: - 1.084257608 t(line1) + q(y) + delta(line1.y)
= 0.686533416
_eline(line2.x)#2: 0.115236774 t(line2) + q(x) + delta(line2.x)
= -0.415575766
_eline(line2.y)#3: 1.26443496 t(line2) + q(y) + delta(line2.y) = -0.551894266
_eline(line3.x)#4: 1.862007808 t(line3) + q(x) + delta(line3.x)
= -0.865772554
_eline(line3.y)#5: 0.157045418 t(line3) + q(y) + delta(line3.y)
= 0.000421337999999993
dist(line1)#6: - linear_part_of_dist(line1) = 0
dist(line2)#7: - linear_part_of_dist(line2) = 0
dist(line3)#8: - linear_part_of_dist(line3) = 0
obj#9: z = 0
QCP_row_for_dist(line1): linear_part_of_dist(line1) + [ delta(line1.x) ^2
+ delta(line1.y) ^2 - d(line1) ^2 ] <= 0
QCP_row_for_dist(line2): linear_part_of_dist(line2) + [ delta(line2.x) ^2
+ delta(line2.y) ^2 - d(line2) ^2 ] <= 0
QCP_row_for_dist(line3): linear_part_of_dist(line3) + [ delta(line3.x) ^2
+ delta(line3.y) ^2 - d(line3) ^2 ] <= 0
Bounds
t(line1) Free
t(line2) Free
t(line3) Free
q(x) Free
q(y) Free
delta(line1.x) Free
delta(line1.y) Free
delta(line2.x) Free
delta(line2.y) Free
delta(line3.x) Free
delta(line3.y) Free
z Free
linear_part_of_dist(line1) Free
linear_part_of_dist(line2) Free
linear_part_of_dist(line3) Free
End
This model leads to:
CPLEX Error 5002: 'QCP_row_for_dist(line1)' is not convex.
If we clean this up a bit by hand, we get:
\ENCODING=ISO-8859-1
\Problem name: gamsmodel
Minimize
_obj: d(line1) + d(line2) + d(line3)
Subject To
_eline(line1.x)#0: 0.757256448 t(line1) + q(x) + delta(line1.x)
= -0.656505736
_eline(line1.y)#1: - 1.084257608 t(line1) + q(y) + delta(line1.y)
= 0.686533416
_eline(line2.x)#2: 0.115236774 t(line2) + q(x) + delta(line2.x)
= -0.415575766
_eline(line2.y)#3: 1.26443496 t(line2) + q(y) + delta(line2.y) = -0.551894266
_eline(line3.x)#4: 1.862007808 t(line3) + q(x) + delta(line3.x)
= -0.865772554
_eline(line3.y)#5: 0.157045418 t(line3) + q(y) + delta(line3.y)
= 0.000421337999999993
QCP_row_for_dist(line1): [ delta(line1.x) ^2
+ delta(line1.y) ^2 - d(line1) ^2 ] <= 0
QCP_row_for_dist(line2): [ delta(line2.x) ^2
+ delta(line2.y) ^2 - d(line2) ^2 ] <= 0
QCP_row_for_dist(line3): [ delta(line3.x) ^2
+ delta(line3.y) ^2 - d(line3) ^2 ] <= 0
Bounds
t(line1) Free
t(line2) Free
t(line3) Free
q(x) Free
q(y) Free
delta(line1.x) Free
delta(line1.y) Free
delta(line2.x) Free
delta(line2.y) Free
delta(line3.x) Free
delta(line3.y) Free
End
Cplex is happy with this model. We see:
Barrier - Optimal: Objective = 1.2652779019e-01.