Wednesday, June 30, 2021

Arranging points on a line

The problem stated in [1] is:



In the original problem, the poster talked about the distance between neighboring points. But we don't know in advance which points will end up being neighbors, so we can generalize and require the condition to hold for any two points.

This problem looks deceptively simple, but the modeling has some interesting angles.

We will attack this problem in different ways:
  • An MINLP model. The main idea is to get rid of the abs() function, as it is non-differentiable.
  • A MIP model using binary variables or SOS1 variables. Again, we reformulate the abs() function.
  • Apply a fixing strategy to reduce the model complexity. Good solvers actually don't need this.
  • A genetic algorithm (ga) approach.


Data


To do some experiments, I generated a data set with 50 points. Their ranges are:

----     15 PARAMETER bounds  ranges

i1 .lo 13.740,    i1 .up 23.656,    i2 .lo 67.461,    i2 .up 78.799,    i3 .lo 44.030,    i3 .up 53.442
i4 .lo 24.091,    i4 .up 28.349,    i5 .lo 23.377,    i5 .up 24.673,    i6 .lo 17.924,    i6 .up 19.462
i7 .lo 27.986,    i7 .up 37.605,    i8 .lo 68.502,    i8 .up 76.681,    i9 .lo  5.369,    i9 .up  5.842
i10.lo 40.017,    i10.up 51.902,    i11.lo 79.849,    i11.up 80.941,    i12.lo 46.299,    i12.up 48.934
i13.lo 79.291,    i13.up 87.175,    i14.lo 60.980,    i14.up 72.233,    i15.lo 10.455,    i15.up 13.127
i16.lo 51.178,    i16.up 51.690,    i17.lo 12.761,    i17.up 21.538,    i18.lo 20.006,    i18.up 29.325
i19.lo 53.514,    i19.up 59.355,    i20.lo 34.829,    i20.up 40.209,    i21.lo 28.776,    i21.up 32.422
i22.lo 28.115,    i22.up 31.812,    i23.lo 10.519,    i23.up 12.477,    i24.lo 12.008,    i24.up 26.010
i25.lo 47.129,    i25.up 52.828,    i26.lo 66.471,    i26.up 78.222,    i27.lo 18.465,    i27.up 22.966
i28.lo 53.259,    i28.up 55.141,    i29.lo 62.069,    i29.up 73.302,    i30.lo 24.293,    i30.up 25.331
i31.lo  8.839,    i31.up 11.870,    i32.lo 40.191,    i32.up 40.267,    i33.lo 12.814,    i33.up 16.858
i34.lo 69.797,    i34.up 77.295,    i35.lo 21.209,    i35.up 23.478,    i36.lo 22.865,    i36.up 25.478
i37.lo 47.516,    i37.up 52.476,    i38.lo 57.818,    i38.up 62.571,    i39.lo 50.260,    i39.up 55.091
i40.lo 37.104,    i40.up 51.563,    i41.lo 33.065,    i41.up 47.969,    i42.lo  9.416,    i42.up 14.964
i43.lo 25.137,    i43.up 30.730,    i44.lo  3.724,    i44.up 15.304,    i45.lo 27.084,    i45.up 33.034
i46.lo 14.568,    i46.up 28.264,    i47.lo 51.658,    i47.up 53.452,    i48.lo 44.860,    i48.up 55.892
i49.lo 61.597,    i49.up 62.428,    i50.lo 23.824,    i50.up 32.469


The data looks like:


In the picture above we see our random data set. On the right, I ordered the intervals by their lower bound. Just this small step makes the picture much more intuitive. We see there is quite some overlap, and also that some intervals are really small. In the original post, the author was really interested in an algorithm for this. I don't know if there is an easy algorithm for this problem, especially one that finds optimal solutions. Formulating this as a formal optimization problem is my line of attack.

High-level model


A high-level model that defines our problem can look like:

 

High-level Model
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le |\color{darkred}x_i - \color{darkred}x_j| && \forall i \lt j \\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \end{align}\]


We only need to compare points \(i\) and \(j\) if \(i\lt j\) (otherwise we would be checking each pair twice). Modeling the absolute value is interesting. 
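To make the high-level objective concrete, here is a small Python sketch (not part of the original post) that evaluates \(\max \min_{i \lt j} |x_i - x_j|\) for a tiny hypothetical instance of three intervals by brute-force grid search. The interval data is made up for illustration.

```python
import itertools

# hypothetical tiny instance: three intervals [lo, up]
bounds = [(0.0, 1.0), (0.5, 2.0), (3.0, 4.0)]

def min_pairwise_distance(x):
    # objective: smallest |x_i - x_j| over all pairs i < j
    return min(abs(a - b) for a, b in itertools.combinations(x, 2))

# crude grid search: 21 candidate positions per interval
def grid(lo, up, n=21):
    return [lo + (up - lo) * t / (n - 1) for t in range(n)]

best = max(
    (min_pairwise_distance(x), x)
    for x in itertools.product(*(grid(lo, up) for lo, up in bounds))
)
print(best)   # optimal value 2.0, e.g. at x = (0.0, 2.0, 4.0)
```

Of course, grid search only works for toy instances; the point of the post is to solve this properly with MINLP/MIP formulations.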


MINLP Model


In order to be able to use a standard MINLP solver, we can formulate:


MINLP Model
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}\delta_{i,j} (\color{darkred}x_i-\color{darkred}x_j) + (1-\color{darkred}\delta_{i,j})(\color{darkred}x_j-\color{darkred}x_i)  && \forall i\lt j\\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}\delta_{i,j} \in \{0,1\}\end{align}\]


I have some thoughts about this model.
  • The model has no abs() function. That is good, as abs() is non-differentiable. See [2] for some of the problems that can arise if we use abs() directly.
  • I solved this model with Baron. Another possible candidate is Gurobi, as all the nonlinearities are quadratic. Note that the problem is nonconvex.
  • In some cases, the algorithm may initially have a negative objective. This happens when it chooses \(\color{darkred}\delta_{i,j}=1\) for cases where range \(i\) is to the left of range \(j\); in that case \(\color{darkred}x_i-\color{darkred}x_j\) is negative. (Or the other way around: it chooses \(\color{darkred}\delta_{i,j}=0\) for cases where range \(i\) is to the right of range \(j\).) One simple fix is to add the lower bound: \[\color{darkred}z \ge 0\]
  • Another, possibly more impactful change can be devised by observing that if two ranges \(i\) and \(j\) don't overlap, we already know which branch to choose. So we can add the bounds: \[\begin{align} & \color{darkblue}\ell_j \gt \color{darkblue}u_i\Rightarrow \color{darkred}\delta_{i,j}=0 && \forall i\lt j\\ & \color{darkblue}\ell_i \gt \color{darkblue}u_j\Rightarrow \color{darkred}\delta_{i,j}=1 && \forall i\lt j  \end{align} \] Note that this is just fixing some of the binary variables ahead of solving the problem. So, using our data set, how many binary variables could be fixed due to non-overlapping ranges? I kept track:

----     77 PARAMETER fixed  statistics on fixed variables

total    1225,    fixed(0)  529,    fixed(1)  493,    unfixed   203


This means that of a total of 1225 binary variables, we could fix 529 to zero and 493 to one. Only 203 binary variables are left. The question is of course: will this help the model, or are solvers smart enough that we don't need to do this fixing? We have to try this out.
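The fixing rule itself is easy to implement. Here is a Python sketch (using a small made-up data set, not the 50-point set above) that classifies each pair \(i \lt j\) as fixed-to-0, fixed-to-1, or unfixed, following the overlap test from the previous section:

```python
# Sketch of the fixing rule: for each pair i<j, if the ranges do not
# overlap we already know the sign of x_i - x_j, so delta can be fixed.
# Hypothetical small data set (lo, up) per point:
bounds = {
    "i1": (13.7, 23.7),
    "i2": (67.5, 78.8),
    "i3": (44.0, 53.4),
    "i4": (24.1, 45.0),
}

names = list(bounds)
fix0 = fix1 = unfixed = 0
for a in range(len(names)):
    for b in range(a + 1, len(names)):
        lo_i, up_i = bounds[names[a]]
        lo_j, up_j = bounds[names[b]]
        if lo_j > up_i:       # range i completely left of range j: delta = 0
            fix0 += 1
        elif lo_i > up_j:     # range i completely right of range j: delta = 1
            fix1 += 1
        else:                 # ranges overlap: leave delta free
            unfixed += 1

print(fix0, fix1, unfixed)    # counts over the 6 pairs
```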


MIP Models


We can linearize the absolute value. Admittedly this requires a bit of effort. 

The constraint \(\color{darkred}y=|\color{darkred}x|\) can be interpreted as:\[\begin{align} & \color{darkred}y\ge \color{darkred}x  \>{\bf and}\>\color{darkred}y \ge -\color{darkred}x \\ & \color{darkred}y\le \color{darkred}x  \>{\bf or}\>\color{darkred}y \le -\color{darkred}x \end{align}\] This can be linearized as:  \[\begin{align} & \color{darkred}y \ge \color{darkred}x \\ &  \color{darkred}y \ge -\color{darkred}x \\ &\color{darkred}y \le \color{darkred}x + \color{darkblue}M\cdot \color{darkred}\delta \\ &\color{darkred}y \le -\color{darkred}x + \color{darkblue}M\cdot (1-\color{darkred}\delta) \\ &  \color{darkred}y \ge 0 \\ & \color{darkred}\delta \in \{0,1\}\end{align}\]
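To sanity-check this linearization, here is a small Python sketch (assuming a hypothetical bound \(|x|\le 10\), so \(M=20\) is safely large) that confirms by brute force that the largest feasible \(y\) under the six constraints is exactly \(|x|\):

```python
# Brute-force check of the big-M linearization of y = |x| (a sketch;
# the bound |x| <= 10 and hence M = 20 are assumptions for this demo).
M = 20.0

def feasible(x, y, delta):
    # all constraints of the linearization above
    return (y >= x and y >= -x and y >= 0
            and y <= x + M * delta
            and y <= -x + M * (1 - delta))

# For each test value of x, the largest feasible y (over both choices
# of delta and a fine grid of y values) should be exactly |x|.
results = {}
for x in [-7.5, -1.0, 0.0, 2.5, 9.0]:
    ys = [k / 100 for k in range(0, 2001)]    # grid 0.00 .. 20.00
    results[x] = max(y for y in ys
                     for delta in (0, 1) if feasible(x, y, delta))
print(results)
```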


When we apply this to our model we can do the following:

MIP Model with binary variables
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}d_{i,j} && \forall i \lt j \\  & \color{darkred}d_{i,j} \le \color{darkred}x_i - \color{darkred}x_j + \color{darkblue}M_{1,i,j} \cdot (1-\color{darkred}\delta_{i,j}) && \forall i \lt j \\ & \color{darkred}d_{i,j}\le \color{darkred}x_j-\color{darkred}x_i + \color{darkblue}M_{2,i,j} \cdot \color{darkred}\delta_{i,j} && \forall i \lt j \\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}\delta_{i,j}\in \{0,1\} \\ & \color{darkred}d_{i,j} \ge 0  \end{align}\]


There are quite a few things to say about this model:
  • Tight values for the big-M's are \[\begin{align}&\color{darkblue}M_{1,i,j} = 2\left(\color{darkblue}u_j-\color{darkblue}\ell_i\right)\\ &\color{darkblue}M_{2,i,j} = 2\left(\color{darkblue}u_i-\color{darkblue}\ell_j\right) \end{align}\] The value for \(\color{darkblue}M_{1,i,j}\) can be derived by setting \(\color{darkred}\delta_{i,j}=0\), so we have \[\begin{cases} \color{darkred}d_{i,j} \le \color{darkred}x_i - \color{darkred}x_j + \color{darkblue}M_{1,i,j} \\ \color{darkred}d_{i,j}\le \color{darkred}x_j-\color{darkred}x_i \end{cases}\] The first inequality requires \(\color{darkblue}M_{1,i,j} \ge \color{darkred}x_j- \color{darkred}x_i + \color{darkred}d_{i,j}\), and the second bounds \(\color{darkred}d_{i,j}\) by \(\color{darkred}x_j-\color{darkred}x_i\), so it suffices that \[\begin{align} & \color{darkblue}M_{1,i,j} = \max(\color{darkred}x_j- \color{darkred}x_i) +\max(\color{darkred}x_j- \color{darkred}x_i) \\ \Rightarrow \>&\color{darkblue}M_{1,i,j} = 2\left(\color{darkblue}u_j-\color{darkblue}\ell_i\right)\end{align}\] Instead of repeating these steps for \(\color{darkblue}M_{2,i,j}\), we can simply use a symmetry argument.
  • Note that the inequalities \[\begin{align}& \color{darkred}d_{i,j} \ge \color{darkred}x_i - \color{darkred}x_j && \forall i \lt j \\ & \color{darkred}d_{i,j}\ge \color{darkred}x_j-\color{darkred}x_i  && \forall i \lt j \end{align}\] are dropped from the model, as the objective pushes \(\color{darkred}d_{i,j}\) upwards.
  • We can use the same fixing rule as before.
  • The fixing, whether done manually or by the presolver, will get rid of the larger big-M's, which correspond to non-overlapping intervals. Here are the statistics for our example data set:
    • Maximum \(\color{darkblue}M\) before fixing: 166.902
    • Maximum \(\color{darkblue}M\) after fixing: 49.081
    This means fixing not only makes the model smaller, but also numerically easier to solve.
  • If we don't have good bounds (here we do), we can use SOS1 variables or indicator variables instead. Below is a SOS1-based model:


MIP Model with SOS1 sets
\[\begin{align} \max\>& \color{darkred}z \\ & \color{darkred}z \le \color{darkred}d_{i,j} && \forall i \lt j \\ & \color{darkred}d_{i,j} = \color{darkred}x_i - \color{darkred}x_j + \color{darkred}v_{i,j,1} && \forall i \lt j \\ & \color{darkred}d_{i,j} = \color{darkred}x_j-\color{darkred}x_i + \color{darkred}v_{i,j,2} && \forall i \lt j \\ &\color{darkred}v_{i,j,1}, \color{darkred}v_{i,j,2} \in {\bf SOS1} && \forall i \lt j\\ & \color{darkred}x_i \in [\color{darkblue}\ell_i, \color{darkblue}u_i] \\ & \color{darkred}v_{i,j,k}\ge 0 \end{align}\]


We can also fix \(\color{darkred}v_{i,j,k}=0\) where appropriate.
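As a small sanity check (not in the original post): with \(\color{darkred}d_{i,j}=|\color{darkred}x_i-\color{darkred}x_j|\), the slacks \(\color{darkred}v_{i,j,1}\) and \(\color{darkred}v_{i,j,2}\) defined by the two equality constraints are both nonnegative and at most one of them is nonzero, which is exactly what the SOS1 condition requires. A quick Python sketch:

```python
# For the SOS1 model: with d = |x_i - x_j|, the slacks
#   v1 = d - (x_i - x_j)   and   v2 = d - (x_j - x_i)
# are nonnegative and at most one is nonzero (the SOS1 condition).
def sos1_slacks(xi, xj):
    d = abs(xi - xj)
    v1 = d - (xi - xj)
    v2 = d - (xj - xi)
    return v1, v2

for xi, xj in [(5.0, 2.0), (2.0, 5.0), (3.0, 3.0)]:
    v1, v2 = sos1_slacks(xi, xj)
    assert v1 >= 0 and v2 >= 0
    assert v1 == 0 or v2 == 0   # SOS1: at most one member nonzero
print("slacks satisfy the SOS1 condition")
```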

Results


The results look like:

----    202 PARAMETER results  

                 MINLP    MINLP/FX     MIP/BIN  MIP/BIN/FX     MIP/SOS  MIP/SOS/FX

points          50.000      50.000      50.000      50.000      50.000      50.000
vars          1276.000    1276.000    2501.000    2501.000    3726.000    3726.000
  discr       1225.000    1225.000    1225.000    1225.000
  fixed                   1022.000                1022.000                1022.000
equs          1225.000    1225.000    3675.000    3675.000    3675.000    3675.000
status       TimeLimit   TimeLimit     Optimal     Optimal   TimeLimit   TimeLimit
obj              1.130       1.130       1.130       1.130       1.130       1.130
time          1000.000    1000.010       4.125       4.375    1010.765    1000.187
nodes          423.000     413.000   17569.000   17569.000 1.869415E+7 2.202626E+7
iterations                           59834.000   59834.000 4.883251E+7 3.922582E+7
gap%             8.362       8.362                               4.082       0.609


All models find the optimal solution of 1.130, but only the MIP model with binary variables was able to prove optimality within the allotted time of 1,000 seconds. The fixing does not make much difference, except in the case of SOS1 sets. Why this is, I don't know.

Indeed, for the MIP model, Cplex recognizes that it can remove a lot of binary variables; the logs indicate this. For both versions (without and with fixing), Cplex shows:

Reduced MIP has 203 binaries, 0 generals, 0 SOSs, and 0 indicators.

This is the number of binary variables we expect to see after removing all fixed binary variables. For the SOS1 models we see:

Reduced MIP has 0 binaries, 0 generals, 203 SOSs, and 0 indicators.

All in all, the MIP model solves very fast.



This picture has two representations of the solution. At the bottom, we see an optimal arrangement of points. ("An optimal arrangement", as the solution is likely not unique). Next, we also show the points in relation to their allowed intervals. Note that we only maximize the minimum distance between two neighbors. That means we don't directly care about longer distances (of course indirectly we do: if we can make these large distances smaller, we may have more room to spread things out).  It is interesting to see that a lot of points are precisely at the minimum distance.


Genetic algorithm


A quick experiment using the ga function in R leads to poor results. Using default settings, I find a best objective of 0.26, which is far below the optimal value of 1.130. With non-default settings, I was able to improve this to 0.63. One advantage of this approach is that we can use a very simple objective. An elegant, vectorized implementation of the objective (to be maximized) is:

     min(diff(sort(x)))

But these results indicate this method may have trouble beating the MIP model.
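For readers who don't use R, the same objective can be sketched in Python (an illustrative translation, with a made-up example point set):

```python
# Python analogue of the R objective min(diff(sort(x))): sort the
# points and return the smallest gap between consecutive values.
def min_neighbor_distance(x):
    s = sorted(x)
    return min(b - a for a, b in zip(s, s[1:]))

print(min_neighbor_distance([1.0, 4.0, 2.2]))   # smallest gap is 2.2 - 1.0 = 1.2
```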

> #default
> set.seed(12345)
> res <- ga(type="real-valued",
+           fitness=obj,
+           lower=df$lo,
+           upper=df$up,
+           monitor=T)
GA | iter = 1 | Mean = 0.02821910 | Best = 0.09633924
GA | iter = 2 | Mean = 0.03786265 | Best = 0.10989994
GA | iter = 3 | Mean = 0.03528131 | Best = 0.13367077
GA | iter = 4 | Mean = 0.04404433 | Best = 0.13367077
GA | iter = 5 | Mean = 0.04248061 | Best = 0.14047471
GA | iter = 6 | Mean = 0.04260766 | Best = 0.14047471
GA | iter = 7 | Mean = 0.04794552 | Best = 0.14047471
GA | iter = 8 | Mean = 0.04760054 | Best = 0.14047471
GA | iter = 9 | Mean = 0.04103926 | Best = 0.14047471
  . . . 
GA | iter = 95 | Mean = 0.2227525 | Best = 0.2363474
GA | iter = 96 | Mean = 0.2302461 | Best = 0.2363474
GA | iter = 97 | Mean = 0.2318391 | Best = 0.2527717
GA | iter = 98 | Mean = 0.2226839 | Best = 0.2527717
GA | iter = 99 | Mean = 0.2346336 | Best = 0.2584418
GA | iter = 100 | Mean = 0.2238337 | Best = 0.2584418


Conclusion


This is a fascinating little model that we can attack in different ways. A linearized MIP model using binary variables seems the best of the approaches I tried. It looks like spending effort to fix as many binary variables as possible may not be needed with a good MIP solver: the presolver will take care of that. Presolvers (in advanced solvers) are very complex pieces of machinery, and this example shows what they are capable of. Nevertheless, spending a bit of time on the MIP formulation has a big pay-off. With a careful linearization and carefully chosen big-M constants, we can bring down the solution time for 50 points to 4 seconds. The derivation of optimal values for the big-M's is quite elegant.

The SOS1 formulation is easier: we don't need to worry about big-M's. However, provided we have good, tight big-M's, the binary variable formulation can often perform much better. This is a good example of this observation.

A meta-heuristic, in our case a genetic algorithm, has trouble finding really good, close-to-optimal solutions. I suspect the form of the objective (maximize the minimum) has something to do with this. In the optimal solution, many points are tight: they are exactly \(\color{darkred}z^*\) away from a neighbor. A simple, elegant, vectorized R implementation of the objective is used.

This is a simple problem, but, as often, there is more to say about it than you would think at first. 

Updates


This post has been updated with (substantial) improvements by Rob Pratt and Paul Rubin. 

References


  1. Given n points where each point has its own range, adjust all points to maximize the minimum distance of adjacent points, https://stackoverflow.com/questions/68180974/given-n-points-where-each-point-has-its-own-range-adjust-all-points-to-maximize
  2. Median, quantiles and quantile regression as linear programming problems, https://yetanothermathprogrammingconsultant.blogspot.com/2021/06/median-quantiles-and-quantile.html

Appendix 1. GAMS models


Notes:
  • The first model is denoted as an MINLP model. It could also be declared as a MIQCP model (e.g. when solving with Gurobi - you also need an option file with the nonconvex option).

$ontext

Arrange points on a line such that the minimum distance between points
is maximized. Each point has a valid interval where it can be placed.
These intervals may overlap.

Random data set with 50 points.

Implement three different models:
  1. MINLP model
  2. MIP model using binary variables
  3. MIP model using SOS1 variables

All models are solved in two ways (w/o and with fixing).

Some improvements in the MIP model were proposed by Rob Pratt.

https://yetanothermathprogrammingconsultant.blogspot.com/2021/06/arranging-points-on-line.html

erwin@amsterdamoptimization.com

$offtext

*--------------------------------------------------
* data
*--------------------------------------------------

set
   i 'points' /i1*i50/
   b 'bounds' /lo,up/
;

parameter bounds(i,b) 'ranges';
bounds(i,'lo') = uniform(0,80);
bounds(i,'up') = bounds(i,'lo') + uniform(0,15);

option bounds:3:0:6;
display bounds;

*-------------------------------------------------------
* reporting macros
*-------------------------------------------------------

acronym TimeLimit;
acronym Optimal;
acronym Error;

parameter results(*,*);
$macro report(m,label,isfixed)  \
    results('points',label) = card(i); \
    results('vars',label) = m.numVar; \
    results('  discr',label) = m.numDVar; \
    results('  fixed',label)$isfixed = card(fix0)+card(fix1); \
    results('equs',label) = m.numEqu; \
    results('status',label) = Error; \
    results('status',label)$(m.solvestat=1) = Optimal; \
    results('status',label)$(m.solvestat=3) = TimeLimit; \
    results('obj',label) = z.l; \
    results('time',label) = m.resusd; \
    results('nodes',label) = m.nodusd; \
    results('iterations',label) = m.iterusd; \
    results('gap%',label)$(m.solvestat=3) = 100*abs(m.objest - m.objval)/abs(m.objest);


*--------------------------------------------------
* MINLP model
*--------------------------------------------------

alias(i,j);
variable
   x(i)        'location'
   delta(i,j)  'binary variable'
   z
;
binary variable delta;

x.lo(i) = bounds(i,'lo');
x.up(i) = bounds(i,'up');
z.lo = 0;

* what can we fix?
sets
   gt(i,j)
   fix0(i,j)
   fix1(i,j)
;
gt(i,j) = ord(i)>ord(j);
fix0(gt(i,j)) = bounds(j,'lo')>bounds(i,'up');
fix1(gt(i,j)) = bounds(i,'lo')>bounds(j,'up');

parameter fixed(*) 'statistics on fixed variables';
fixed('total') = card(gt);
fixed('fixed(0)') = card(fix0);
fixed('fixed(1)') = card(fix1);
fixed('unfixed') = card(gt)-card(fix0)-card(fix1);
option fixed:0;
display fixed;

equations dist(i,j);

dist(gt(i,j)).. z =l= delta(i,j)*(x(i)-x(j)) + (1-delta(i,j))*(x(j)-x(i));

model m1 /dist/;
option optcr=0, minlp=baron, threads=8, reslim=1000;
solve m1 maximizing z using minlp;

report(m1,'MINLP',0)
display results;

* remove solution
delta.l(gt) = 0;
x.l(i) = 0;

* and fix variables
delta.fx(fix0) = 0;
delta.fx(fix1) = 1;

solve m1 maximizing z using minlp;

report(m1,'MINLP/FX',1)
display results;

* unfix for next model
delta.lo(gt) = 0;
delta.up(gt) = 1;


*--------------------------------------------------
* MIP model
*--------------------------------------------------

parameter M(*,i,j) 'big-M';
M('1',gt(i,j)) = 2*(bounds(j,'up')-bounds(i,'lo'));
M('2',gt(i,j)) = 2*(bounds(i,'up')-bounds(j,'lo'));

positive variable d(i,j) 'absolute distance between points i and j';

equations
   dist1(i,j)    'not needed due to objective'
   dist2(i,j)    'not needed due to objective'
   dist3(i,j)    'one of dist3/dist4 is active'
   dist4(i,j)
   smallest(i,j) 'bound on d(i,j)'
;

dist1(gt(i,j)).. d(i,j) =g= x(i)-x(j);
dist2(gt(i,j)).. d(i,j) =g= x(j)-x(i);
dist3(gt(i,j)).. d(i,j) =l= x(i)-x(j) + M('1',i,j)*(1-delta(i,j));
dist4(gt(i,j)).. d(i,j) =l= x(j)-x(i) + M('2',i,j)*delta(i,j);
smallest(gt(i,j)).. z =l= d(i,j);

* note we skip dist1 and dist2
model m2 /dist3,dist4,smallest/;
solve m2 maximizing z using mip;

report(m2,'MIP/BIN',0)
display results;

* fix variables
delta.fx(fix0) = 0;
delta.fx(fix1) = 1;

solve m2 maximizing z using mip;

report(m2,'MIP/BIN/FX',1)
display results;


*--------------------------------------------------
* MIP model using SOS1 variables
*--------------------------------------------------

set k /case1, case2/;

sos1 variable v(i,j,k);

equations
  distA(i,j), distB(i,j)
;


distA(gt(i,j)).. d(i,j) =e= x(i)-x(j) + v(i,j,'case1');
distB(gt(i,j)).. d(i,j) =e= x(j)-x(i) + v(i,j,'case2');

model m3 /distA,distB,smallest/;
solve m3 maximizing z using mip;

report(m3,'MIP/SOS',0)
display results;

* fix variables
v.fx(fix0,'case2') = 0;
v.fx(fix1,'case1') = 0;

solve m3 maximizing z using mip;

report(m3,'MIP/SOS/FX',1)
display results;


Appendix 2: R code for GA model


This code benefitted from comments by Paul Rubin.

For testing purposes, the optimal MIP solution is included in the data. That is done to make sure we can evaluate the objective for this \(x\) (and make sure our optimal MIP value is returned). 


#
# Try GA on our problem
#
# this data contains the optimal MIP solution
# so we can test our objective function

df <- read.table(text="
id         lo          up           x   
i1     13.73977    23.65636    19.08758 
i2     67.46134    78.79866    72.05679 
i3     44.03003    53.44174    49.68838 
i4     24.09103    28.34900    26.25486 
i5     23.37697    24.67334    23.99505 
i6     17.92423    19.46195    17.95768 
i7     27.98644    37.60521    34.16419 
i8     68.50163    76.68127    70.92689 
i9      5.36910     5.84197     5.36910 
i10    40.01685    51.90226    45.16877 
i11    79.84941    80.94092    80.42055 
i12    46.29867    48.93359    46.29867 
i13    79.29064    87.17513    79.29064 
i14    60.98004    72.23315    63.85675 
i15    10.45540    13.12725    12.30816 
i16    51.17750    51.68962    51.17750 
i17    12.76143    21.53840    16.82778 
i18    20.00644    29.32489    28.51467 
i19    53.51429    59.35472    58.94743     
i20    34.82851    40.20922    39.06089     
i21    28.77602    32.42154    29.64457     
i22    28.11531    31.81163    30.77448     
i23    10.51933    12.47687    11.09919     
i24    12.00814    26.00989    15.69787 
i25    47.12909    52.82816    47.42857 
i26    66.47142    78.22243    66.47142
i27    18.46526    22.96577    20.21749 
i28    53.25876    55.14101    54.76192 
i29    62.06861    73.30172    62.72684 
i30    24.29268    25.33117    25.12495 
i31     8.83938    11.86962     8.83938 
i32    40.19079    40.26678    40.19079 
i33    12.81382    16.85802    13.43806 
i34    69.79698    77.29476    69.79698 
i35    21.20916    23.47845    21.34739 
i36    22.86515    25.47769    22.86515 
i37    47.51647    52.47604    48.55848 
i38    57.81753    62.57112    57.81753 
i39    50.25989    55.09120    53.43731 
i40    37.10383    51.56348    37.10383 
i41    33.06456    47.96859    35.97392 
i42     9.41563    14.96417     9.96929 
i43    25.13698    30.73031    27.38476 
i44     3.72412    15.30380     3.72412 
i45    27.08402    33.03428    33.03428 
i46    14.56797    28.26441    14.56797 
i47    51.65817    53.45184    52.30740 
i48    44.85964    55.89183    55.89183 
i49    61.59694    62.42821    61.59694 
i50    23.82447    32.46897    31.90438 
",header=T)

# sort to simplify things 
obj <- function(x) {
  min(diff(sort(x)))
}

# test with optimal solution
obj(df$x)
# [1] 1.1299


library(GA);

#default
set.seed(12345)
res <- ga(type="real-valued",
          fitness=obj,
          lower=df$lo,
          upper=df$up,
          monitor=T)
# obj = 0.26

set.seed(12345)
res <- gaisl(type="real-valued",
          fitness=obj,
          lower=df$lo,
          upper=df$up,
          maxiter=1000,
          popSize=200,
          monitor=T)
# obj = 0.63

5 comments:

  1. Depending on the MIP solver, you might get better solve times by omitting the d[i,j] >= constraints (which will automatically be satisfied at optimality) and/or by tightening the big-M values:
    D[i,j] - X[i] + X[j] <= 2 * (X[j].ub - X[i].lb) * (1 - Delta[i,j]);
    D[i,j] - X[j] + X[i] <= 2 * (X[i].ub - X[j].lb) * Delta[i,j];
    To derive the big-M for the first constraint, note that if Delta[i,j] = 0 then the second constraint implies that D[i,j] <= X[j] - X[i] <= X[j].ub - X[i].lb, so D[i,j] - X[i] + X[j] <= (X[j].ub - X[i].lb) - X[i].lb + X[j].ub = 2 * (X[j].ub - X[i].lb).
    The derivation of the second big-M is similar.

    Replies
    1. Thanks. Good points. Re the second point. It looks like Cplex removes binary variables just like my fixing approach. That means it only works on cases where intervals overlap. That is good: it will remove binary variables with possibly large big-M's.

    2. Actually after incorporating these updates the MIP went from 100 seconds to 4 seconds!

  2. I tested your R code a bit. Not surprisingly, if the GA is allowed to run longer, if the population size is increased, or if you do both of those *and* use the "island" GA model (which is a bit less prone to premature convergence, I think), the GA gets better results ... but still not optimal. All told, I agree that the GA is unlikely to be competitive with the MIP model when the number of dimensions is around 50. I also tested three variants of the objective function (outside the GA): yours (nested for loops); a version replacing the loops with an outer product operator (which is still looping internally, but in compiled C code); and one that sorts first and computes consecutive differences. The last one is much faster at higher dimensions. (The outer product version outperforms explicit loops but is not competitive with sorting.) So I wonder whether a high problem dimension would slow down the MIP solver enough to make a GA competitive?

    Replies
    1. Agreed: my code was totally unoptimized. I still have no good feeling for when a GA is a good choice. I guess here all the x(i) variables are highly inter-dependent. Also, on very large problems, a MIP solver may find good solutions quite quickly (without hope of proving optimality). Good MIP solvers may run something like a GA internally as a heuristic.
