Monday, January 4, 2021

Continuous Facility Location: Symmetry, SOCP and grid search

In this post, I consider the following question: 

Given \(m\) consumer locations, what is the minimum number of facilities (warehouses)  needed, \(n\), so each consumer is within \(K\) kilometers of a warehouse. 

A second, related question is:

Where should we place these \(n\) warehouses? 

I looked at this earlier in [1]. I want to add some new experiments: 

  • Do something about symmetry: breaking symmetry by simple ordering constraints. These can be highly effective (but can also disappoint).
  • See if I can formulate the "min sum distance" model as a convex SOCP (second-order cone programming) problem.
  • Compare with a discrete location problem: place facilities on a grid. This makes the model very large but linear. How does it compare to the quadratic models that place facilities anywhere?

Generate consumer locations.


I just use random locations drawn from a uniform distribution: \[\color{darkblue}p_{i,c} \sim U(0,\color{darkblue}S)\] where \(c \in \{x,y\}\) are coordinates and \(\color{darkblue}S=100\). Of course, this is not very realistic. 

Q: What would be a good way to generate more realistic locations? Typically, locations are clustered a bit. Maybe: uniform locations with some random count of consumers at each location.   

For my experiments, I used a dataset \(m=50\) consumer locations. This is both small enough to make the models solve quickly and large enough to show differences in performance.

  

----     37 PARAMETER p  points

              x           y

p1       17.175      84.327
p2       55.038      30.114
p3       29.221      22.405
p4       34.983      85.627
p5        6.711      50.021
p6       99.812      57.873
p7       99.113      76.225
p8       13.069      63.972
p9       15.952      25.008
p10      66.893      43.536
p11      35.970      35.144
p12      13.149      15.010
p13      58.911      83.089
p14      23.082      66.573
p15      77.586      30.366
p16      11.049      50.238
p17      16.017      87.246
p18      26.511      28.581
p19      59.396      72.272
p20      62.825      46.380
p21      41.331      11.770
p22      31.421       4.655
p23      33.855      18.210
p24      64.573      56.075
p25      76.996      29.781
p26      66.111      75.582
p27      62.745      28.386
p28       8.642      10.251
p29      64.125      54.531
p30       3.152      79.236
p31       7.277      17.566
p32      52.563      75.021
p33      17.812       3.414
p34      58.513      62.123
p35      38.936      35.871
p36      24.303      24.642
p37      13.050      93.345
p38      37.994      78.340
p39      30.003      12.548
p40      74.887       6.923
p41      20.202       0.507
p42      26.961      49.985
p43      15.129      17.417
p44      33.064      31.691
p45      32.209      96.398
p46      99.360      36.990
p47      37.289      77.198
p48      39.668      91.310
p49      11.958      73.548
p50       5.542      57.630





Below I will discuss some models that can be used to attack this problem. 


Model 1: Minimize the number of facilities 


The task of this model is just to find the number of facilities \(n\) that is needed to obey the maximum distance constraint. The model will actually model possible locations for the facilities, but these are tentative. They are not optimal. That is: there are locations that are closer to the customers.

In the following, \(i\) indicates the consumer locations, and \(j\) refers to warehouses (facilities). The number of possible warehouses we can use is encoded in the set \(j \in J\). It should be larger than the number we expect, but not crazy large as this makes the problem much larger. If you choose it too small, the problem will be infeasible.


MIQCP model 1: min num facilities
Data\[\begin{align}&i &&\text{consumers, customers}\\ & j &&\text{facilities, warehouses (overestimate number)} \\ & c=\{x,y\} && \text{use 2D coordinates}\\ & \color{darkblue}p_{i,c} && \text{locations of customers} \\ & \color{darkblue}K && \text{maximum distance} \\ & \color{darkblue} S = 100 && \text{side of square we are considering} \\ & \color{darkblue}M_1= 2\color{darkblue}S^2 - \color{darkblue}K^2&& \text{big-M value}\end{align}\]
Variables\[\begin{align} &\color{darkred}w_{j,c} \in [0,\color{darkblue}S] && \text{location of facility}\\ &\color{darkred}a_{i,j} \in \{0,1\} && \text{assignment of consumer to facility}\\ & \color{darkred}u_j \in \{0,1\} && \text{facility is used} \\ & \color{darkred} n && \text{number of facilities needed}\end{align}\]
Equations\[\begin{align}\min\>&\color{darkred}n = \sum_j \color{darkred} u_j \\ & \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 \le \color{darkblue}K^2 + \color{darkblue}M_1 (1- \color{darkred}a_{i,j}) && \forall i,j && \text{distance}\\ & \sum_j \color{darkred}a_{i,j} = 1 && \forall i && \text{assignment}\\ & \color{darkred}a_{i,j} \le \color{darkred}u_j && \forall i,j && \text{closed facility} \\ & \color{darkred}u_j \ge \color{darkred}u_{j+1} && && \text{ordering (optional)} \\ & \color{darkred}w_{j,x} \le \color{darkred}w_{j+1,x}&& && \text{ordering (optional)} \end{align} \]


We restricted the search area to \([0,\color{darkblue}S]^2\). Assuming facilities will be placed within the convex hull of customer points, we could as well require: \[ \min_i \color{darkblue}p_{i,c}\le \color{darkred}w_{j,c} \le \max_i \color{darkblue}p_{i,c}\] For simplicity, I chose \(\color{darkred}w_{j,c} \in [0,\color{darkblue}S]\). 

The distance equation says: for consumer \(i\) there is (at least) one facility \(j\) that is closer than \(K\) km. We do this by assigning a warehouse to each consumer and imposing the distance constraint for that assigned pair. A general nonlinear distance constraint could look like: \[\color{darkred}a_{i,j}=1 \Rightarrow \sqrt{\sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2} \le \color{darkblue} K\] This could be implemented as a quadratic indicator constraint: \[\color{darkred}a_{i,j}=1 \Rightarrow \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 \le \color{darkblue} K^2\] Interestingly, solvers that support indicator constraints, seem to allow only linear constraints on the right. (Why? I don't know) Here we have a quadratic constraint, so we have to reformulate this as a big-M constraint \[ \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 \le \color{darkblue}K^2 +\color{darkblue}M_1 (1-\color{darkred}a_{i,j})\] A conservative bound gives us the following value \(\color{darkblue}M_1  = 2\color{darkblue}S^2 - \color{darkblue}K^2\).

This is a convex quadratic constraint, so we can use well-known solvers for this.  

We have two ordering constraints. The first one says: the used warehouses are the first ones. I.e. \(u_j\) will look like \([1,1,1,0,0,0,0]\). This actually helped me in the implementation of the subsequent models. The first \(n\) should be reused in those models. But it also imposes an ordering that can help against a lot of symmetry. To further reduce symmetry we also order the facilities by their \(x\)-coordinate. 

The addition of these symmetry-breaking constraints is not a guaranteed success. We make the search space much smaller, so this is a big advantage. But we also may make finding a feasible solution much more difficult. Before doing some experimentation, it is very hard to predict how these trade-offs will play out.

The solution can look like:



We see we need 3 facilities, but the locations and the assignments are not optimal. We will use a second model to address this.


Model 2: Minimize the sum of the squared distances


In this model, we will try to optimize the locations of the \(n\) facilities (and the assignment of customers to their closest warehouse) by minimizing the sum of the squared distances. We use the optimal number of facilities needed from model 1. This also reduces the size of set \(j\). 

The model can be stated as:

MIQCP model 2: min sum squared distances
Data\[\begin{align}& \color{blue}n && \text{number of facilities needed}\\ &i &&\text{consumers, customers}\\ & j = \{1,\dots,\color{darkblue}n\}&&\text{facilities, warehouses} \\ & c=\{x,y\} && \text{use 2D coordinates}\\ & \color{darkblue}p_{i,c} && \text{locations of customers} \\ & \color{darkblue}K && \text{maximum distance} \\ & \color{darkblue}M_2 = 2\color{darkblue}S^2&& \text{big-M value} \end{align}\]
Variables\[\begin{align} &\color{darkred}\Delta_{i,j}\in [0,\color{darkblue}K^2]&& \text{squared distance between assigned facility and customer (or zero)} \\ &\color{darkred}w_{j,c} \in [0,\color{darkblue}S] && \text{location of facility}\\ &\color{darkred}a_{i,j} \in \{0,1\} && \text{assignment of consumer to facility} \end{align}\]
Equations\[\begin{align}\min\>&\sum_{i,j} \color{darkred} \Delta_{i,j} \\ & \color{darkred}\Delta_{i,j} \ge \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 - \color{darkblue}M_2 (1- \color{darkred}a_{i,j}) && \forall i,j && \text{squared distance}\\ & \sum_j \color{darkred}a_{i,j} = 1 && \forall i && \text{assignment}\\ & \color{darkred}w_{j,x} \le \color{darkred}w_{j+1,x}&& && \text{ordering (optional)} \end{align} \]


The distance constraint implements the implication: \[\color{darkred}a_{i,j} = 1 \Rightarrow \color{darkred}\Delta_{i,j} \ge \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2\] Again, indicator constraints are only for linear constraints, so we need a big-M formulation. A conservative value for the big-M constant is \(2\color{darkblue}M_2 = \color{darkblue}S^2\). This will allow \(\color{darkred}\Delta_{i,j}=0\) when \(\color{darkred}a_{i,j} = 0\).

The result can be:



This picture is much better. We see that the location of warehouses and the assignments make much more sense. 


We can look at this as a multi-objective problem where we have the objectives (1) minimize the number of facilities and (2) minimize the sum of (squared) distances. Here we used a lexicographic approach: solve for the first objective and add this as a "constraint" to the second objective problem. In this case, the constraint turns out just a question of dimensioning the second problem.  


Model 3: Minimize the sum of the distances


It looks like converting model 2 from minimizing the sum of squared distances to the sum of distances should be an easy task. However, this is not exactly the case. A first attempt leads to an MINLP model:


MINLP model 3a: min sum distances
Equations\[\begin{align}\min\>&\sum_{i,j} \sqrt{\color{darkred} \Delta_{i,j}} \\ & \color{darkred}\Delta_{i,j} \ge \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 - \color{darkblue}M_2 (1- \color{darkred}a_{i,j}) && \forall i,j && \text{squared distance}\\ & \sum_j \color{darkred}a_{i,j} = 1 && \forall i && \text{assignment}\\ & \color{darkred}w_{j,x} \le \color{darkred}w_{j+1,x}&& && \text{ordering (optional)} \end{align} \]


The data and the variables are the same as for model 2. The objective makes this problem a general MINLP model which is more difficult to solve and requires different solvers. 

It is noted that the objective may be problematic. The square root function \(\sqrt{x}\) can not be evaluated for any \(x\lt 0\). Furthermore, the gradient of \(\sqrt{x}\) is not defined at \(x=0\). This makes this a bit dangerous to use in this form. One way to work around this is to use a small constant \(\color{darkblue}\epsilon\gt 0\) and write \[ \min\>\sum_{i,j} \sqrt{\color{darkblue}\epsilon+ \color{darkred}\Delta_{i,j}}\] Another possibility is to change the lower bound a little bit: \(\color{darkred}\Delta_{i,j}\in [\color{darkblue}\epsilon,\color{darkblue}K^2]\).

A quadratic version of the model can look like:


Non-convex MIQCP model 3b: min sum distances
Variables\[\begin{align} &\color{darkred}d_{i,j}\in [0,\color{darkblue}K]&& \text{distance between assigned facility and customer (or zero)} \\ &\color{darkred}w_{j,c} \in [0,\color{darkblue}S] && \text{location of facility}\\ &\color{darkred}a_{i,j} \in \{0,1\} && \text{assignment of consumer to facility} \end{align}\]
Equations\[\begin{align}\min\>&\sum_{i,j} \color{darkred} d_{i,j} \\ & \color{darkred}d^2_{i,j} \ge \sum_c \left( \color{darkblue}p_{i,c} - \color{darkred}w_{j,c}\right)^2 - \color{darkblue}M_2 (1- \color{darkred}a_{i,j}) && \forall i,j && \text{squared distance}\\ & \sum_j \color{darkred}a_{i,j} = 1 && \forall i && \text{assignment}\\ & \color{darkred}w_{j,x} \le \color{darkred}w_{j+1,x}&& && \text{ordering (optional)} \end{align} \]



This is a quadratic model, but the distance constraint is non-convex. Or to be more specific: this is not recognized as being convex by solvers like Cplex and Gurobi. With quite some effort we can shoehorn this into a convex MISOCP (Mixed Integer Second Cone Programming) model. An SOCP constraint [2] looks like: \[||\color{darkblue}A_i \color{darkred}x + \color{darkblue}b_i||_2 \le \color{darkblue}c_i^T \color{darkred}x+\color{darkblue}d_i\] It is a bit difficult to recognize, but this allows for a constraint like \[\begin{align}& \color{darkred}d_{i,j}^2 \ge \sum_c \color{darkred}y_{i,j,c}^2 \\ & \color{darkred}d_{i,j} \ge 0\end{align}\] With this we can write:



MISOCP Model 3c: min sum distances
Variables\[\begin{align} &\color{darkred}z_{i,j}\in [0,\color{darkblue}K]&& \text{distance between assigned facility and customer (or zero)} \\ &\color{darkred}d_{i,j}\in [0,\sqrt{\color{darkblue}M_2}]&& \text{distance between any facility and customer}\\ & \color{darkred} y_{i,j,c} && \text{intermediate free variable} \\ &\color{darkred}w_{j,c} \in [0,\color{darkblue}S] && \text{location of facility}\\ &\color{darkred}a_{i,j} \in \{0,1\} && \text{assignment of consumer to facility} \end{align}\]
Equations\[\begin{align}\min\>&\sum_{i,j} \color{darkred} z_{i,j} \\ & \color{darkred}z_{i,j} \ge \color{darkred}d_{i,j} - \sqrt{\color{darkblue} M_2 }(1-\color{darkred}a_{i,j}) && \forall i,j && \text{distance is zero when not assigned}\\ & \color{darkred}d^2_{i,j} \ge \sum_c \color{darkred}y^2_{i,j,c} && \forall i,j && \text{SOCP constraint}  \\ & \color{darkred}y_{i,j,c} =  \color{darkblue}p_{i,c} - \color{darkred}w_{j,c} && \forall i,j,c && \text{intermediate}\\ & \sum_j \color{darkred}a_{i,j} = 1 && \forall i && \text{assignment}\\ & \color{darkred}w_{j,x} \le \color{darkred}w_{j+1,x}&& && \text{ordering (optional)} \end{align} \]


We needed to add a lot of extra variables and constraints to make this model work for an SOCP solver. 




Model 4: grid search


If we only consider a grid of possible locations for the facilities, things simplify significantly. This now becomes a discrete facility location problem. If we indicate our grid points by \(g\), then we can calculate the distances between customer locations \(\color{darkblue}p_{i,c}\) and possible facility locations \(\color{darkblue}q_{g,c}\) in advance (i.e. outside the optimization model). So these nonlinear expressions are now reduced to just data manipulation: \[\color{darkblue}{\mathit{dist}}_{i,g} = \sqrt{\sum_c \left( \color{darkblue}p_{i,c} - \color{darkblue}q_{g,c}\right)^2}\] This is great, as we can now develop a linear model. We again use two models to capture the multi-objective nature of this problem. 

In the models below we only consider admissible assignments, that is only assignments of customer to warehouse where the distance is less than \(\color{darkblue}K\). We do this by introducing a two dimensional set \(\color{darkblue}{\mathit{ok}}_{i,g}\) with \[\color{darkblue}{\mathit{ok}}_{i,g} = \color{darkblue}{\mathit{dist}}_{i,g} \le \color{darkblue}K\] In the model we only consider \((i,g)\) combinations in the set \(\color{darkblue}{\mathit{ok}}\).
 

MIP Model 4: locations on a grid
Data\[\begin{align}&g&&\text{grid points}\\ &\color{darkblue}{\mathit{dist}}_{i,g}&&\text{distances}\\ &\color{darkblue}{\mathit{ok}}_{i,g}&&\text{admissible assignments} \end{align}\]
Variables\[\begin{align}&\color{darkred}{\mathit{place}}_g \in \{0,1\}&&\text{whether to place a facility at $g$}\\ &\color{darkred}{\mathit{ag}}_{i,g} \in \{0,1\}&&\text{assign customer $i$ to $g$} \end{align}\]
Model 4a\[\begin{align}\min\>&\color{darkred}n\\&\sum_{g|\color{darkblue}{\mathit{ok}}(i,g)} \color{darkred}{\mathit{ag}}_{i,g} = 1 &&\forall i &&\text{assignment} \\ & \color{darkred}{\mathit{ag}}_{i,g} \le \color{darkred}{\mathit{place}}_g &&\forall i,g|\color{darkblue}{\mathit{ok}}(i,g) &&\text{use location}\\ &\color{darkred}n = \sum_g \color{darkred}{\mathit{place}}_g && && \text{number of facilities}\end{align}\]
Model 4b\[\begin{align} \min & \sum_{i,g|\color{darkblue}{\mathit{ok}}(i,g)} \color{darkblue}{\mathit{dist}}_{i,g} \cdot \color{darkred}{\mathit{ag}}_{i,g} && && \text{sum distances} \\ & \sum_{g|\color{darkblue}{\mathit{ok}}(i,g)} \color{darkred}{\mathit{ag}}_{i,g} = 1 &&\forall i &&\text{assignment} \\ & \color{darkred}{\mathit{ag}}_{i,g} \le \color{darkred}{\mathit{place}}_g &&\forall i,g|\color{darkblue}{\mathit{ok}}(i,g) &&\text{use location}\\ & \sum_g \color{darkred}{\mathit{place}}_g = \color{darkblue}n^* && && \text{fix number of facilities} \end{align} \]



Both models 4a and 4b are linear. I set up a grid for \(x\) and \(y\) coordinates: \(1,3,5,\dots,99\). This results in \(50\times50=2500\) grid points.


The pictures look very much the same.

This fine grid leads to a very large model. The number of binary variables is 46,542 for our data set which is very large. The number of equations is similarly large: 44,094. These numbers depend on the grid and on how many admissible assignments with a distance less than \(K=40\) there are.

There is no corresponding ordering constraint for this problem. 

Performance


We have produced a number of models here. Let's see how they behave.

ModelTypeObjectiveSum dist^2Sum distTimeNodesIterations
1MIQCPmin num fac42,1271,38020.2715,487490,331
1 + orderMIQCPmin num fac39,2881,3136.3751,23028,284
2MIQCPmin sum dist^221,105937.849.23255,858976,174
2 + orderMIQCPmin sum dist^221,105937.815.3187,319328,474
3MISOCPmin sum dist21,186935.12,9815,896,36166,465,832
3 + orderMISOCPmin sum dist21,186935.1431.9933,00214,612,297
4aMIPmin num fac38,1341,27331.17021
4bMIPmin sum dist21,313944.715.94017


A lot to digest here. But what we see is:
  • Adding the ordering constraints to help with symmetry does indeed help. Sometimes a lot.
  • The MIP model is super easy to solve. Of course, the problem is reduced to more or less an assignment problem. This is something solvers can do very well. In this case, the model can be solved completely during preprocessing. But note that it is very big.
  • The second-order cone model is much more difficult to solve than the MIQCP model that minimizes the sum of the squared distances. My intuition: too many epigraph-type indirections in the model so the pricing signals from the objective are less obvious. 
 

Conclusions


  • Simple symmetry-breaking constraints that enforce an ordering on the decision variables can make a large difference.
  • Indicator constraints are for linear constraints only. So we had to implement big-M formulations.
  • Formulating SOCPs can require some real effort. Tools (modeling systems, solvers) can do a better job in applying appropriate transformations automatically.
  • Experimentation with different formulations can pay off. Here we see that min sum distance^2 is preferable above min sum distance: much easier to solve and almost the same optimal solution. Not obvious (to me) before implementing and trying both formulations.
  • It may be better to use a grid of candidate locations. This way distances can be precomputed, and we have a linear problem. However, the model can become very big if the grid is too fine.
  • Just a simple question: what is the minimum number of warehouses needed so each customer is within a given distance? Not so simple to answer. There are quite a few angles to explore.

References


1 comment:

  1. I sometimes wonder why solvers don't provide built-in support to compare the performance of different problem formulations. Often, the large number of possible formulations makes it difficult to compare by hand.

    ReplyDelete