Convex hull as optimization problem
In the previous posts, the construction of a convex hull played a significant role: it was an easy way to reduce the size of the data sets, often by a large amount. Somehow, it escaped me that we can formulate this as a conceptually simple optimization problem.The idea is that we can locate each interior point by forming a linear combination of the convex hull points: \[\begin{align}& {\color{darkblue}p}_i = \sum_j {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_j \\ & \sum_j {\color{darkred}\lambda}_{i,j} = 1 \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\] where \(i\) is an interior point and \(j\) indicates a point on the hull. This looks linear, but things become a little bit more complicated when we realize that being an interior or hull point is endogenous. A high-level model can look like:
| High-level model |
|---|
| \[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkblue}p}_{i,c} = \sum_{j|{\color{darkred}h}_j=1} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c} && i|{\color{darkred}nh}_j=1 \\ & \sum_{j|{\color{darkred}h}_j=1} {\color{darkred}\lambda}_{i,j} = 1 && i|{\color{darkred}nh}_j=1 \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\} \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\] |
| MINLP model |
|---|
| \[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkblue}p}_{i,c} {\color{darkred}{nh}}_i = \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c} {\color{darkred}h}_j && \forall i\\ & \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkred}h}_j = {\color{darkred}nh}_i && \forall i \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\}\\ & {\color{darkred}\lambda}_{i,j} \in [0,1]\end{align}\] |
---- 93 PARAMETER p data points x y hull/scipy hull/minlp point1 4.973 9.639 1.000 1.000 point2 8.739 8.443 1.000 1.000 point3 1.912 4.945 1.000 1.000 point4 2.346 3.810 point5 6.009 1.515 1.000 1.000 point6 5.077 4.633 point7 3.591 1.526 point8 3.664 8.260 1.000 1.000 point9 0.239 0.593 1.000 1.000 point10 2.848 1.140 point11 8.905 3.743 1.000 1.000 point12 2.341 1.477 point13 0.191 0.784 1.000 1.000 point14 9.280 5.550 1.000 1.000 point15 6.164 3.605 count 9.000 9.000
It is noted that not all hull points are present in the weights:
---- 89 VARIABLE h.L hull point1 1.000, point2 1.000, point3 1.000, point5 1.000, point8 1.000, point9 1.000, point11 1.000 point13 1.000, point14 1.000 ---- 89 VARIABLE nh.L interior point4 1.000, point6 1.000, point7 1.000, point10 1.000, point12 1.000, point15 1.000 ---- 89 VARIABLE lambda.L weights point1 point2 point5 point11 point13 point4 0.240 0.118 0.642 point6 0.460 0.109 0.431 point7 0.331 0.169 0.500 point10 0.439 0.012 0.549 point12 0.030 0.227 0.743 point15 0.167 0.522 0.311
We can linearize this model so that we have a linear MIP model and don't need global solvers. However, for prototyping MIP models, using a simpler exploratory MINLP model is something I do more often. It helps to check if the underlying idea is actually working, before spending time on developing linearizations.
Doing local searchPreprocessing found feasible solution with value 9.00000Solving bounding LPProblem solved during preprocessingLower bound is 9.00000
| Linear MIP model |
|---|
| \[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkred}\lambda}_{i,j} \le {\color{darkred}h}_j && \forall i,j \\ & {\color{darkblue}p}_{i,c} {\color{darkred}{nh}}_i = \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c} && \forall i\\ & \sum_{j} {\color{darkred}\lambda}_{i,j} = {\color{darkred}nh}_i && \forall i \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\} \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\] |
---- 121 PARAMETER p data points x y hull/scipy hull/minlp hull/mip point1 4.973 9.639 1.000 1.000 1.000 point2 8.739 8.443 1.000 1.000 1.000 point3 1.912 4.945 1.000 1.000 1.000 point4 2.346 3.810 point5 6.009 1.515 1.000 1.000 1.000 point6 5.077 4.633 point7 3.591 1.526 point8 3.664 8.260 1.000 1.000 1.000 point9 0.239 0.593 1.000 1.000 1.000 point10 2.848 1.140 point11 8.905 3.743 1.000 1.000 1.000 point12 2.341 1.477 point13 0.191 0.784 1.000 1.000 1.000 point14 9.280 5.550 1.000 1.000 1.000 point15 6.164 3.605 count 9.000 9.000 9.000
The MIP solver also did not need any nodes. Using random points it is easy to generate a larger test case.
\(n=100\) data set
---- 117 PARAMETER p data points x y hull/scipy hull/minlp hull/mip point1 4.973 9.639 point2 8.739 8.443 point3 1.912 4.945 point4 2.346 3.810 point5 6.009 1.515 point6 5.077 4.633 point7 3.591 1.526 point8 3.664 8.260 point9 0.239 0.593 1.000 1.000 1.000 point10 2.848 1.140 point11 8.905 3.743 point12 2.341 1.477 point13 0.191 0.784 1.000 1.000 1.000 point14 9.280 5.550 point15 6.164 3.605 point16 5.214 4.320 point17 7.499 8.174 point18 0.160 3.729 1.000 1.000 1.000 point19 6.971 7.669 point20 8.875 8.117 point21 7.897 8.036 point22 0.166 3.932 1.000 1.000 1.000 point23 9.601 7.215 1.000 1.000 1.000 point24 8.147 8.375 point25 7.745 5.283 point26 6.557 3.924 point27 4.851 4.026 point28 4.346 1.935 point29 1.290 4.876 point30 7.912 6.856 point31 3.026 0.001 1.000 1.000 1.000 point32 4.463 8.148 point33 2.080 3.360 point34 2.816 1.565 point35 6.203 8.599 point36 6.153 3.857 point37 7.633 7.315 point38 0.798 0.829 point39 0.192 2.454 point40 3.341 3.038 point41 8.349 3.863 point42 0.531 5.056 point43 1.919 2.345 point44 9.347 7.861 point45 9.435 1.718 1.000 1.000 1.000 point46 1.817 2.026 point47 5.323 5.065 point48 2.108 6.382 point49 2.959 5.533 point50 0.342 8.075 1.000 1.000 1.000 point51 3.551 9.887 1.000 1.000 1.000 point52 8.136 7.153 point53 2.435 5.417 point54 5.019 1.108 point55 1.686 2.869 point56 8.416 8.060 point57 5.631 1.570 point58 2.715 0.339 point59 7.184 7.050 point60 1.577 7.522 point61 9.554 7.855 point62 2.566 0.880 point63 7.997 2.193 point64 9.556 8.034 1.000 1.000 1.000 point65 4.599 2.022 point66 8.957 5.241 point67 8.720 9.062 1.000 1.000 1.000 point68 3.519 3.229 point69 9.245 6.341 point70 4.141 2.687 point71 2.817 7.412 point72 6.571 4.559 point73 3.806 4.961 point74 8.999 5.323 point75 0.602 3.960 point76 4.302 9.600 point77 2.717 8.777 point78 1.872 4.952 point79 3.734 4.279 point80 9.519 5.120 point81 4.618 4.190 point82 0.915 9.232 1.000 1.000 1.000 point83 9.528 6.382 point84 5.353 8.983 point85 0.462 4.249 point86 8.164 4.834 point87 8.054 3.067 point88 8.594 7.318 point89 5.609 3.905 point90 2.741 4.446 point91 9.810 2.776 1.000 1.000 1.000 point92 7.988 9.737 1.000 1.000 1.000 point93 5.810 7.965 point94 8.287 7.178 point95 4.339 2.033 point96 2.708 6.648 point97 1.108 8.467 point98 2.175 2.350 point99 8.997 8.511 point100 8.928 2.236 count 14.000 14.000 14.000
GAMS Models
$ontext
Calculate convex hull of a set of points using an MINLP or MIP model
$offtext
option seed = 12345;
option minlp=baron, mip=cplex;
*----------------------------------------------------- * data *-----------------------------------------------------
set i 'data points' /point1*point15/ k 'coordinates' /x,y/ ;
parameter p(*,*) 'data points'; p(i,k) = uniform(0,10); display p;
*----------------------------------------------------- * convex hull python code *-----------------------------------------------------
set hull(i) 'convex hull computed by scipy';
embeddedCode Python: import scipy as sp import numpy as np import gams.transfer as gt
print("Computing convex hull...") i = list(gams.get("i")) p = gt.Container(gams.db)["p"].toDense() hull = sp.spatial.ConvexHull(p) h = [i[pt] for pt in hull.vertices] gams.set("hull",h) endEmbeddedCode hull
display hull;
p(i,'hull/scipy') = hull(i); p('count','hull/scipy') = card(hull); display p; *----------------------------------------------------- * minlp convex hull model *-----------------------------------------------------
alias(i,j);
variable lambda(i,j) 'weights' ; lambda.lo(i,j) = 0; lambda.up(i,j) = 1; lambda.fx(i,i) = 0;
variable z 'objective';
binary variable h(i) 'hull' nh(i) 'interior' ;
Equation choose(i) 'point is either hull or interior' inside(i,k) 'convex combination, only for nh(i)=1' sum_lambda(i) 'normalize weights, only fot n(i)=1' count 'number of hull points' ;
choose(i).. h(i)+nh(i) =e= 1;
inside(i,k).. p(i,k)*nh(i) =e= sum(j, lambda(i,j)*p(j,k)*h(j));
sum_lambda(i).. sum(j, lambda(i,j)*h(j)) =e= nh(i); count.. z =e= sum(i, h(i));
model m /all/;
solve m minimizing z using minlp;
display h.l,nh.l,lambda.l;
p(i,'hull/minlp') = h.l(i)>0.5; p('count','hull/minlp') = sum(i$(h.l(i)>0.5),1); display p;
*----------------------------------------------------- * mip convex hull model *-----------------------------------------------------
Equations lambda_bnd(i,j) 'h(j)=0 => lambda(i,j)=0' inside2(i,k) 'linearization uses lambda_bnd' sum_lambda2 'linear version' ;
lambda_bnd(i,j).. lambda(i,j) =l= h(j); inside2(i,k).. p(i,k)*nh(i) =e= sum(j, lambda(i,j)*p(j,k)); sum_lambda2(i).. sum(j, lambda(i,j)) =e= nh(i); model m2 /choose,lambda_bnd,inside2,sum_lambda2,count/; solve m2 minimizing z using mip;
display h.l,lambda.l;
p(i,'hull/mip') = h.l(i)>0.5; p('count','hull/mip') = sum(i$(h.l(i)>0.5),1); display p; |
Conclusions
Computing the convex hull of a set of point can be done with standard off-the-shelf algorithms. It is an interesting exercise to develop mathematical programming models that achieve the same. The MINLP and MIP models developed here seem surprisingly efficient.
Is this approach new? I could not find any references for this. Of course, one reason could be: this is just not very useful at all. Another reason could be: this is obvious. Anyhow, I think this was an interesting exercise, and not obvious for me. Useless models are some of my favorites.
References
- P. M. Pardalos, Y. Li, W. W. Hager, Linear programming approaches to the convex hull problem in \(Rm\), Computers & Mathematics with Applications, Volume 29, Issue 7, 1995, Pages 23-29.
No comments:
Post a Comment