Thursday, April 30, 2026

Convex hull models

Convex hull as optimization problem

In the previous posts, the construction of a convex hull played a significant role: it was an easy way to reduce the size of the data sets, often by a large amount. Somehow, it escaped me that we can formulate this as a conceptually simple optimization problem. 


The idea is that we can locate each interior point by forming a linear combination of the convex hull points: \[\begin{align}& {\color{darkblue}p}_i = \sum_j {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_j \\ & \sum_j {\color{darkred}\lambda}_{i,j} = 1 \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\] where \(i\) is an interior point and \(j\) indicates a point on the hull. This looks linear, but things become a little bit more complicated when we realize that being an interior or hull point is endogenous. A high-level model can look like: 

High-level model
\[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkblue}p}_{i,c} = \sum_{j|{\color{darkred}h}_j=1} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c} && i|{\color{darkred}nh}_j=1 \\ & \sum_{j|{\color{darkred}h}_j=1} {\color{darkred}\lambda}_{i,j} = 1 && i|{\color{darkred}nh}_j=1 \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\} \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\]

Here \({\color{darkred}h}_i=1\) means point \(i\) is part of the convex hull. \({\color{darkred}{nh}}_i=1\) means point \(i\) is an interior point. We want to prevent that all points are labeled as a hull point. Therefore we minimize the number of points that are part of the convex hull. The objective is essential.

Unfortunately, this model is not something we can directly feed to a solver. However, it is not too difficult to form an MINLP model that can actually be used:

MINLP model
\[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkblue}p}_{i,c} {\color{darkred}{nh}}_i = \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c} {\color{darkred}h}_j && \forall i\\ & \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkred}h}_j = {\color{darkred}nh}_i && \forall i \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\}\\ & {\color{darkred}\lambda}_{i,j} \in [0,1]\end{align}\]

This model has a few nonconvex quadratic terms. A small data set illustrates we get the correct convex hull:

----     93 PARAMETER p  data points

                  x           y  hull/scipy  hull/minlp

point1        4.973       9.639       1.000       1.000
point2        8.739       8.443       1.000       1.000
point3        1.912       4.945       1.000       1.000
point4        2.346       3.810
point5        6.009       1.515       1.000       1.000
point6        5.077       4.633
point7        3.591       1.526
point8        3.664       8.260       1.000       1.000
point9        0.239       0.593       1.000       1.000
point10       2.848       1.140
point11       8.905       3.743       1.000       1.000
point12       2.341       1.477
point13       0.191       0.784       1.000       1.000
point14       9.280       5.550       1.000       1.000
point15       6.164       3.605
count                                 9.000       9.000

It is noted that not all hull points are present in the weights:

----     89 VARIABLE h.L  hull

point1  1.000,    point2  1.000,    point3  1.000,    point5  1.000,    point8  1.000,    point9  1.000,    point11 1.000
point13 1.000,    point14 1.000


----     89 VARIABLE nh.L  interior

point4  1.000,    point6  1.000,    point7  1.000,    point10 1.000,    point12 1.000,    point15 1.000


----     89 VARIABLE lambda.L  weights

             point1      point2      point5     point11     point13

point4        0.240       0.118                               0.642
point6                    0.460                   0.109       0.431
point7                                0.331       0.169       0.500
point10                               0.439       0.012       0.549
point12                               0.030       0.227       0.743
point15                   0.167                   0.522       0.311

We can linearize this model so that we have a linear MIP model and don't need global solvers. However, for prototyping MIP models, using a simpler exploratory MINLP model is something I do more often. It helps to check if the underlying idea is actually working, before spending time on developing linearizations. 

It is noted that the MINLP model is quite reasonable. For this small data set I see in the Baron log:

 Doing local search
 Preprocessing found feasible solution with value 9.00000
 Solving bounding LP
 Problem solved during preprocessing
 Lower bound is    9.00000

So it did not need any Branch & Bound nodes.

A linearized model can look like:

Linear MIP model
\[\begin{align}\min&\sum_i {\color{darkred}h}_i \\ & {\color{darkred}h}_i + {\color{darkred}{nh}}_i = 1 \\ & {\color{darkred}\lambda}_{i,j} \le {\color{darkred}h}_j && \forall i,j  \\ & {\color{darkblue}p}_{i,c} {\color{darkred}{nh}}_i = \sum_{j} {\color{darkred}\lambda}_{i,j} {\color{darkblue}p}_{j,c}  && \forall i\\ & \sum_{j} {\color{darkred}\lambda}_{i,j}  = {\color{darkred}nh}_i && \forall i \\ & {\color{darkred}h}_i, {\color{darkred}{nh}}_i \in \{0,1\} \\ & {\color{darkred}\lambda}_{i,j} \in [0,1] \end{align}\]

There are no big-M's in the model, which is good. The solution is the same as before:

----    121 PARAMETER p  data points

                  x           y  hull/scipy  hull/minlp    hull/mip

point1        4.973       9.639       1.000       1.000       1.000
point2        8.739       8.443       1.000       1.000       1.000
point3        1.912       4.945       1.000       1.000       1.000
point4        2.346       3.810
point5        6.009       1.515       1.000       1.000       1.000
point6        5.077       4.633
point7        3.591       1.526
point8        3.664       8.260       1.000       1.000       1.000
point9        0.239       0.593       1.000       1.000       1.000
point10       2.848       1.140
point11       8.905       3.743       1.000       1.000       1.000
point12       2.341       1.477
point13       0.191       0.784       1.000       1.000       1.000
point14       9.280       5.550       1.000       1.000       1.000
point15       6.164       3.605
count                                 9.000       9.000       9.000

The MIP solver also did not need any nodes. Using random points it is easy to generate a larger test case.

\(n=100\) data set
----    117 PARAMETER p  data points

                   x           y  hull/scipy  hull/minlp    hull/mip

point1         4.973       9.639
point2         8.739       8.443
point3         1.912       4.945
point4         2.346       3.810
point5         6.009       1.515
point6         5.077       4.633
point7         3.591       1.526
point8         3.664       8.260
point9         0.239       0.593       1.000       1.000       1.000
point10        2.848       1.140
point11        8.905       3.743
point12        2.341       1.477
point13        0.191       0.784       1.000       1.000       1.000
point14        9.280       5.550
point15        6.164       3.605
point16        5.214       4.320
point17        7.499       8.174
point18        0.160       3.729       1.000       1.000       1.000
point19        6.971       7.669
point20        8.875       8.117
point21        7.897       8.036
point22        0.166       3.932       1.000       1.000       1.000
point23        9.601       7.215       1.000       1.000       1.000
point24        8.147       8.375
point25        7.745       5.283
point26        6.557       3.924
point27        4.851       4.026
point28        4.346       1.935
point29        1.290       4.876
point30        7.912       6.856
point31        3.026       0.001       1.000       1.000       1.000
point32        4.463       8.148
point33        2.080       3.360
point34        2.816       1.565
point35        6.203       8.599
point36        6.153       3.857
point37        7.633       7.315
point38        0.798       0.829
point39        0.192       2.454
point40        3.341       3.038
point41        8.349       3.863
point42        0.531       5.056
point43        1.919       2.345
point44        9.347       7.861
point45        9.435       1.718       1.000       1.000       1.000
point46        1.817       2.026
point47        5.323       5.065
point48        2.108       6.382
point49        2.959       5.533
point50        0.342       8.075       1.000       1.000       1.000
point51        3.551       9.887       1.000       1.000       1.000
point52        8.136       7.153
point53        2.435       5.417
point54        5.019       1.108
point55        1.686       2.869
point56        8.416       8.060
point57        5.631       1.570
point58        2.715       0.339
point59        7.184       7.050
point60        1.577       7.522
point61        9.554       7.855
point62        2.566       0.880
point63        7.997       2.193
point64        9.556       8.034       1.000       1.000       1.000
point65        4.599       2.022
point66        8.957       5.241
point67        8.720       9.062       1.000       1.000       1.000
point68        3.519       3.229
point69        9.245       6.341
point70        4.141       2.687
point71        2.817       7.412
point72        6.571       4.559
point73        3.806       4.961
point74        8.999       5.323
point75        0.602       3.960
point76        4.302       9.600
point77        2.717       8.777
point78        1.872       4.952
point79        3.734       4.279
point80        9.519       5.120
point81        4.618       4.190
point82        0.915       9.232       1.000       1.000       1.000
point83        9.528       6.382
point84        5.353       8.983
point85        0.462       4.249
point86        8.164       4.834
point87        8.054       3.067
point88        8.594       7.318
point89        5.609       3.905
point90        2.741       4.446
point91        9.810       2.776       1.000       1.000       1.000
point92        7.988       9.737       1.000       1.000       1.000
point93        5.810       7.965
point94        8.287       7.178
point95        4.339       2.033
point96        2.708       6.648
point97        1.108       8.467
point98        2.175       2.350
point99        8.997       8.511
point100       8.928       2.236
count                                 14.000      14.000      14.000

Also in this case no B&B nodes were needed by the MINLP and MIP models solved with Baron and Cplex respectively.


GAMS Models


$ontext

 

   Calculate convex hull of a set of points using an MINLP or MIP model

 

$offtext

 

option seed = 12345;

 

option minlp=baron, mip=cplex;

 

*-----------------------------------------------------

data

*-----------------------------------------------------

 

set

   i 'data points' /point1*point15/

   'coordinates' /x,y/

;

 

parameter p(*,*'data points';

p(i,k) = uniform(0,10);

display p;

 

*-----------------------------------------------------

convex hull python code

*-----------------------------------------------------

 

set hull(i'convex hull computed by scipy';

 

embeddedCode Python:

import scipy as sp

import numpy as np

import gams.transfer as gt

 

print("Computing convex hull...")

i = list(gams.get("i"))

p = gt.Container(gams.db)["p"].toDense()

hull = sp.spatial.ConvexHull(p)

h = [i[pt] for pt in hull.vertices]

gams.set("hull",h)

endEmbeddedCode hull

 

display hull;

 

p(i,'hull/scipy') = hull(i);

p('count','hull/scipy') = card(hull);

display p;

 

 

*-----------------------------------------------------

minlp convex hull model

*-----------------------------------------------------

 

alias(i,j);

 

variable

  lambda(i,j'weights'

;

lambda.lo(i,j) = 0;

lambda.up(i,j) = 1;

lambda.fx(i,i) = 0;

 

variable 'objective';

 

binary variable

   h(i)    'hull'

   nh(i)   'interior'

;

 

Equation

   choose(i)     'point is either hull or interior'

   inside(i,k)   'convex combination, only for nh(i)=1'

   sum_lambda(i'normalize weights, only fot n(i)=1'

   count         'number of hull points' 

;

 

choose(i).. h(i)+nh(i) =e= 1;

 

inside(i,k)..

    p(i,k)*nh(i) =e= sum(j, lambda(i,j)*p(j,k)*h(j));

 

sum_lambda(i).. sum(j, lambda(i,j)*h(j)) =e= nh(i);

   

count.. z =e= sum(i, h(i));

 

model m /all/;

 

 

solve m minimizing z using minlp;

 

display h.l,nh.l,lambda.l;

 

p(i,'hull/minlp') = h.l(i)>0.5;

p('count','hull/minlp') = sum(i$(h.l(i)>0.5),1);

display p;

 

 

*-----------------------------------------------------

mip convex hull model

*-----------------------------------------------------

 

Equations

   lambda_bnd(i,j'h(j)=0 => lambda(i,j)=0'

   inside2(i,k)    'linearization uses lambda_bnd'

   sum_lambda2     'linear version'

;

 

lambda_bnd(i,j)..  lambda(i,j) =l= h(j);

inside2(i,k)..     p(i,k)*nh(i) =e= sum(j, lambda(i,j)*p(j,k));

sum_lambda2(i)..   sum(j, lambda(i,j)) =e= nh(i);

  

model m2 /choose,lambda_bnd,inside2,sum_lambda2,count/;

solve m2 minimizing z using mip;

 

display h.l,lambda.l;

 

p(i,'hull/mip') = h.l(i)>0.5;

p('count','hull/mip') = sum(i$(h.l(i)>0.5),1);

display p;

 

 


Conclusions

Computing the convex hull of a set of point can be done with standard off-the-shelf algorithms. It is an interesting exercise to develop mathematical programming models that achieve the same. The MINLP and MIP models developed here seem surprisingly efficient.

Is this approach new? I could not find any references for this. Of course, one reason could be: this is just not very useful at all. Another reason could be: this is obvious. Anyhow, I think this was an interesting exercise, and not obvious for me. Useless models are some of my favorites. 


References


  1. P. M. Pardalos, Y. Li, W. W. Hager, Linear programming approaches to the convex hull problem in \(Rm\), Computers & Mathematics with Applications, Volume 29, Issue 7, 1995, Pages 23-29.

No comments:

Post a Comment