Saturday, January 28, 2023

Tiny non-convex quadratic model brings solvers to their knees

Here is a very small geometric problem:

Given \(n\) points in 2d space, find the smallest triangle that contains all these points.


Find the smallest triangle containing all points.


This looks like a simple problem. Somewhat to my surprise, my attempt here does not bear that out.

To model this, we need to address two issues:

  1. Calculate the area of a triangle,
  2. Make sure all data points are inside the triangle.

Area of a triangle


Assuming our triangle has corner points \((\color{darkred}x_1,\color{darkred}y_1)\), \((\color{darkred}x_2,\color{darkred}y_2)\) and \((\color{darkred}x_3,\color{darkred}y_3)\), the determinant formula for the area is: \[\color{darkred}A = 0.5 \> {\mathbf{abs}} \> {\mathbf{det}}\begin{bmatrix}\color{darkred}x_1 & \color{darkred}y_1 & 1\\\color{darkred}x_2 & \color{darkred}y_2 & 1 \\ \color{darkred}x_3 & \color{darkred}y_3 & 1\end{bmatrix}\] This can be written as: \[\color{darkred}A=0.5 \cdot |\color{darkred}x_1(\color{darkred}y_2-\color{darkred}y_3) + \color{darkred}x_2(\color{darkred}y_3-\color{darkred}y_1) + \color{darkred}x_3(\color{darkred}y_1-\color{darkred}y_2)|\] The absolute value can be linearized easily. What remains is a non-convex quadratic expression.


Points must be inside the triangle


Here I used a concept called barycentric coordinates [1]. If we can form weights \(\color{darkred}\lambda_k\) such that \[\begin{align}& \color{darkblue}p = \sum_k \color{darkred}\lambda_k \cdot \color{darkred}t_k \\ & \color{darkred}\lambda_k \ge 0 \\ & \sum_k \color{darkred}\lambda_k = 1 \end{align}\] then point \(\color{darkblue}p\) is inside the triangle formed by corner points \(\color{darkred}t_k\). This again introduces a non-convex quadratic expression as both \(\color{darkred}\lambda_k\), and \(\color{darkred}t_k\) are endogenous (i.e. variables). Note that \(\color{darkblue}p\) and \(\color{darkred}t_k\) are two-dimensional, while \(\color{darkred}\lambda_k\) are scalars.


Optimization model


Putting these two things together, we arrive at our optimization model:

 

Smallest triangle model
\[\begin{align}\min\> & \color{darkred}A^+ + \color{darkred}A^- \\ & \color{darkred}A^+ - \color{darkred}A^- = 0.5 \cdot \left[\color{darkred}t_{1,x}(\color{darkred}t_{2,y}-\color{darkred}t_{3,y}) + \color{darkred}t_{2,x}(\color{darkred}t_{3,y}-\color{darkred}t_{1,y}) + \color{darkred}t_{3,x}(\color{darkred}t_{1,y}-\color{darkred}t_{2,y}) \right] \\ & \color{darkblue}p_{i,c} = \sum_k \color{darkred}\lambda_{i,k} \cdot \color{darkred}t_{k,c} && \forall i,c\\ &  \sum_k \color{darkred}\lambda_{i,k} = 1 && \forall i \\ & \color{darkred}A^+, \color{darkred}A^- \ge 0 \\ & \color{darkred}\lambda_{i,k} \ge 0 \\ & \color{darkred}t_{k,c}\> {\mathbf{free}}\\ & c \in \{x,y\}\\ & k\in \{1,2,3\}\\ & i \in \{1,\dots,n\}\end{align}\]

A special case is \(n=3\). We know the optimal solution for this: make the corner points of the triangle equal to the three data points. I would expect this to be fairly easy. We have a non-convex quadratic problem with just 17 variables. We can solve this with a global NLP solver (e.g. Baron, Couenne, SCIP, Antigone, Octeract) or with a global quadratic solver (Gurobi). All these solvers find the optimal solution very quickly, but proving optimality is extremely difficult. Solving to proven optimality takes hours!

The picture at the top used \(n=25\) points. The data and solution for this data set is:

----     76 PARAMETER p  data points

                  x           y

point1       17.175      84.327
point2       55.038      30.114
point3       29.221      22.405
point4       34.983      85.627
point5        6.711      50.021
point6       99.812      57.873
point7       99.113      76.225
point8       13.069      63.972
point9       15.952      25.008
point10      66.893      43.536
point11      35.970      35.144
point12      13.149      15.010
point13      58.911      83.089
point14      23.082      66.573
point15      77.586      30.366
point16      11.049      50.238
point17      16.017      87.246
point18      26.511      28.581
point19      59.396      72.272
point20      62.825      46.380
point21      41.331      11.770
point22      31.421       4.655
point23      33.855      18.210
point24      64.573      56.075
point25      76.996      29.781


----     76 VARIABLE t.L  triangle

                  x           y

corner1      -1.258      93.363
corner2      17.810     -10.334
corner3     136.183      69.896


----     76 VARIABLE area.L  area (using variable splitting)

+ 6902.407


----     76 VARIABLE lambda.L  barycentric coordinates

            corner1     corner2     corner3

point1        0.815       0.059       0.126
point2        0.130       0.534       0.336
point3        0.214       0.655       0.131
point4        0.723       0.015       0.262
point5        0.582       0.418
point6        0.108       0.182       0.710
point7        0.270                   0.730
point8        0.665       0.268       0.067
point9        0.314       0.651       0.035
point10       0.177       0.380       0.443
point11       0.284       0.516       0.199
point12       0.244       0.756
point13       0.562                   0.438
point14       0.629       0.225       0.146
point15       0.002       0.493       0.505
point16       0.559       0.408       0.033
point17       0.847       0.032       0.121
point18       0.283       0.598       0.119
point19       0.467       0.107       0.426
point20       0.225       0.359       0.416
point21       0.053       0.740       0.207
point22       0.049       0.828       0.123
point23       0.152       0.689       0.160
point24       0.298       0.259       0.443
point25                   0.500       0.500


 

Extensions


Initially, I wanted to solve a cover problem with multiple triangles: given \(n\) data points, find a set of \(m\) triangles that cover all points while minimizing the total area of the triangles. After this experiment, I probably should forget about that. Note: I was thinking about relaxing  \(\color{darkred}\lambda_{i,k} \ge 0\) for all but one triangle.

A different approach would be to enumerate and precalculate the smallest triangle over all possible subsets of points. That would mean \(2^n\) sets of points (note that \(2^{25}=33,554,432\)). Many sets have the same convex hull and thus the same covering triangle. A clever enumeration scheme would be called for.   


Conclusion


This problem is, at first sight, not very difficult. However, we need to implement a few geometric concepts that are not obvious. The result is a non-convex quadratic model. Solvers find good (or even optimal) solutions very quickly. Proving optimality is a very different thing: this turns out to be extremely difficult. The \(n=3\) case provides an awe-inspiring example of a tiny quadratic model that just crushes all global solvers.

The remaining questions, of course, are: is this a really bad formulation? Any better ones around? Can we explain why the lower bound remains at 0 until the tree has been fully explored?

References


  1. Barycentric coordinate system, https://en.wikipedia.org/wiki/Barycentric_coordinate_system


Appendix: GAMS model


$onText

 

  Given are n points (2d).

  Find smallest triangle that contains all points.

 

$offText

 

*---------------------------------------------------------------------

* data: points

*---------------------------------------------------------------------

 

set

   i 'points'   /point1*point25/

   'coordinates' /x,y/

;

 

parameter p(i,c'data points';

p(i,c) = uniform(0,100);

 

*---------------------------------------------------------------------

* find smallest triangle to contain all points

*---------------------------------------------------------------------

 

 

sets

   k  'corner points of triangle' /corner1*corner3/

   pm 'plusmin -- used in linearizing abs()' /'+','-'/

;

 

shorthands to make our area calculation easier

singleton sets

   x1(k,c) /'corner1'.'x'/

   x2(k,c) /'corner2'.'x'/

   x3(k,c) /'corner3'.'x'/

   y1(k,c) /'corner1'.'y'/

   y2(k,c) /'corner2'.'y'/

   y3(k,c) /'corner3'.'y'/

;

  

 

variable

   t(k,c)  'triangle'

   z       'objective'

;

 

positive variable

   area(pm)     'area (using variable splitting)'

   lambda(i,k)  'barycentric coordinates'

;

 

equations

   calcArea         'calculate area given its three corner points'

   calcLambda(i,c)  'solve for barycentric coordinates'

   sumLambda(i)     'lambdas need to add up to one'

   obj              'objective'

   order            'order corner points by their x coordinate'

;

 

calcArea..         area('+')-area('-') =e= 0.5*[t(x1)*(t(y2)-t(y3)) + t(x2)*(t(y3)-t(y1)) + t(x3)*(t(y1)-t(y2))];

calcLambda(i,c)..  p(i,c) =e= sum(k, lambda(i,k)*t(k,c));

sumLambda(i)..     sum(k, lambda(i,k)) =e= 1;

obj..              z =e= sum(pm,area(pm));

order(k-1)..       t(k,'x') =g= t(k-1,'x');

 

 

* some reasonable bounds

t.lo(k,c) = -1000;

t.up(k,c) = +1000;

 

 

model m /all/;

option nlp=baron, threads=0, reslim=1000;

solve m minimizing z using nlp;

 

* data + results

display p,t.l,area.l,lambda.l;

 

 

Note that an ordering constraint has been added. Not sure if it helps.

To give you a better idea of how small the \(n=3\) problem is, here is the corresponding LP file:

Minimize
 obj: x7

Subject To
 e1: x8 - x9 + [-0.5 x1 * x4 + 0.5 x1 * x6 + 0.5 x2 * x3 - 0.5 x2 * x5 - 0.5 x3
     * x6 + 0.5 x4 * x5] = 0
 e2:  + [-x1 * x10 - x3 * x11 - x5 * x12] = -17.174713200000003
 e3:  + [-x2 * x10 - x4 * x11 - x6 * x12] = -84.3266708
 e4:  + [-x1 * x13 - x3 * x14 - x5 * x15] = -55.0375356
 e5:  + [-x2 * x13 - x4 * x14 - x6 * x15] = -30.113790400000003
 e6:  + [-x1 * x16 - x3 * x17 - x5 * x18] = -29.221211699999998
 e7:  + [-x2 * x16 - x4 * x17 - x6 * x18] = -22.4052867
 e8: x10 + x11 + x12 = 1
 e9: x13 + x14 + x15 = 1
 e10: x16 + x17 + x18 = 1
 e11: x7 - x8 - x9 = 0
 e12: -x1 + x3 >= 0
 e13: -x3 + x5 >= 0

Bounds
 -1000 <= x1 <= 1000
 -1000 <= x2 <= 1000
 -1000 <= x3 <= 1000
 -1000 <= x4 <= 1000
 -1000 <= x5 <= 1000
 -1000 <= x6 <= 1000
 x7 Free
 0 <= x10 <= 1
 0 <= x11 <= 1
 0 <= x12 <= 1
 0 <= x13 <= 1
 0 <= x14 <= 1
 0 <= x15 <= 1
 0 <= x16 <= 1
 0 <= x17 <= 1
 0 <= x18 <= 1

End


That we can't solve this tiny little model to proven optimality in a reasonably acceptable time is quite scary. 

11 comments:

  1. You can probably just work with the convex hull of the points instead of all of them. I know minimum bounding rectangle/circle algorithms exist, but have not regularly seen a triangle one. Quick google turns up https://link.springer.com/article/10.1007/s40314-014-0198-8.

    For the multiple triangles do not know either. I don't see an easy way to go from Delauney (or maybe minimum spanning tree) to a way to superimpose triangles.

    ReplyDelete
  2. I wonder if there is a combinatorial way to formulate this problem. Can it be shown that at least two sides of the triangle will have 2+ points on them? Then, after these two sides of the triangle are fixed (which could be enumerated or captured via binary variables), is it easy to decide the third side?

    ReplyDelete
    Replies
    1. Yes, it can be shown that it suffices to consider triangles whose two sides have at least two points on them. In this case, either the third side also passes through two points, or the midpoint of the third side is one of the given points. This observation can help us to solve a cover problem with multiple triangles for small n. Indeed, we can just generate all possible specified triangles as candidates.

      Delete
  3. I think it would help greatly even to assert that at least three of the points fall on the boundary of the triangle, some of the lambdas are zero. And of course preprocess the points and use only the convex hull. But, as demonstrated by the case with only 3 points, the convex hull isn't a game changer here.

    ReplyDelete
  4. A better model could be to have the three lines represented directly as ax+by=c. Then the constraints are all linear equalities, but the objective function is more complicated.

    ReplyDelete
    Replies
    1. I am not convinced making sure points are inside is actually so easy with this.

      Delete
    2. Unless I miss something obvious a_jx_i+b_jy_i >= c_j for j=1, 2, 3 (lines) and i = 1,...,m (points) should do the trick. (In a confusing way a_j, b_j, c_j are variables, while x_i and y_i are data.)

      Delete
    3. That may describe a point outside. The great thing about building models is that it forces you to think carefully about the details.

      Delete
    4. Ah, I see what you mean now.

      Delete
  5. One of the problems is that there are 6 symmetric solutions. One has to realize the symmetry or identify 6 local optima before the optimality proof finishes.

    ReplyDelete
    Replies
    1. The GAMS model has actually an ordering constraint: corner points are ordered by their x-coordinate. This does not make much difference.

      Delete