This is not, per se, very useful, but sorting a parameter inside a MIP model is not very easy for MIP solvers. Obviously, if you need a sorted parameter in the model, it is better to use a sorting algorithm. But useless models can still be interesting.
I use:
Input: a 1-dimensional parameter with values.
Output: a 1-dimensional variable with the values sorted in ascending order.
We can implement this with a permutation matrix \(\color{darkred}X\), which is a permuted identity matrix. In a MIP context, this becomes a binary variable with some assignment constraints.
MIP Model for sorting \(p_i\) |
---|
\[\begin{align}\min\>&\color{darkred}z=0\\& \sum_i \color{darkred}x_{i,j} = 1&&\forall j\\ & \sum_j \color{darkred}x_{i,j} = 1&&\forall i\\ & \color{darkred}y_i = \sum_j \color{darkred}x_{i,j}\cdot\color{darkblue}p_j\\& \color{darkred}y_i \ge \color{darkred}y_{i-1}\\ & \color{darkred}x_{i,j} \in \{0,1\}\end{align}\] |
If we want to sort \(n\) values, this model has \(n^2\) binary variables.
One possible improvement is to add the extra constraint \[\sum_i \color{darkred}y_i = \sum_i \color{darkblue}p_i\]
Let's compare a commercial solver (Cplex) with an open-source one (HiGHS). I use here \(n=50,100,200\). Model 1 is without the extra constraint while model 2 includes the extra constraint. The data \(\color{darkblue}p_i\) is random. A time limit of 1,000 seconds was used. Otherwise, the solvers used default settings.
---- 96 PARAMETER results solver performance status time nodes iterations cplex.n=50 .model1 Optimal 2.970000E-1 2408 cplex.n=50 .model2 Optimal 2.500000E-1 2151 cplex.n=100.model1 Optimal 4 14055 cplex.n=100.model2 Optimal 2 7092 cplex.n=200.model1 Optimal 227 3043 1684124 cplex.n=200.model2 Optimal 18 10155 highs.n=50 .model1 Optimal 2 highs.n=50 .model2 Optimal 2 highs.n=100.model1 TimeLimit 1001 9587 4955336 highs.n=100.model2 TimeLimit 1000 15221 5373228
- Cplex is much better on the larger models
- HiGHS is very promising on the \(n=50\) model: it solves this in the presolver (zero nodes, zero LP iterations). But it runs out of steam for large models. I would expect that if the presolver can solve the problem for \(n=50\), it would also be able to do this for \(n=100\). As usual, my predictions are not that good.
- The extra constraint is useful for Cplex, especially for the larger models. Both in terms of time and LP iterations.
Numerical issues encountered:Dual failures : 831 out of 5345 (ratio: 0.1555)Singular bases : 1 out of 50821 (ratio: 0.0000)
Sorting a variable
- Non-convex Quadratic: \[ \color{darkred}y_i = \sum_j \color{darkred}x_{i,j}\cdot\color{darkred}p_j\] You need a global quadratic solver for this.
- Linearization using binary variables: the general case. Assume \(\color{darkred}p_j\in [\color{darkblue}\ell_j,\color{darkblue}u_j]\).We can linearize the products \(\color{darkred}w_{i,j}=\color{darkred}x_{i,j}\cdot\color{darkred}p_j\) as follows: \[\begin{align} & \color{darkred}y_i = \sum_j \color{darkred}w_{i,j} \\ & \color{darkblue}\ell_j \cdot \color{darkred}x_{i,j} \le \color{darkred}w_{i,j} \le \color{darkblue}u_j \cdot \color{darkred}x_{i,j}\\ & \color{darkred}p_j - \color{darkblue}u_j\cdot (1-\color{darkred}x_{i,j}) \le \color{darkred}w_{i,j} \le \color{darkred}p_j - \color{darkblue}\ell_j\cdot (1-\color{darkred}x_{i,j}) \\ & \color{darkred}w_{i,j} \in [\min\{0,\color{darkblue}\ell_j\},\max \{0,\color{darkblue}u_j\}]\end{align}\]
- If \(\color{darkred}p_j\in [0,\color{darkblue}u_j]\), things become a bit simpler: \[\begin{align} & \color{darkred}y_i = \sum_j \color{darkred}w_{i,j} \\ & \color{darkred}w_{i,j} \le \color{darkblue}u_j \cdot \color{darkred}x_{i,j}\\ & \color{darkred}p_j - \color{darkblue}u_j\cdot (1-\color{darkred}x_{i,j}) \le \color{darkred}w_{i,j} \le \color{darkred}p_j \\ & \color{darkred}w_{i,j} \in [0,\color{darkblue}u_j]\end{align}\]
- Using indicator constraints, we can write: \[\begin{align}& \color{darkred}y_i = \sum_j \color{darkred}w_{i,j} \\ & \color{darkred}x_{i,j}=0 \implies \color{darkred}w_{i,j}=0 \\ & \color{darkred}x_{i,j}=1 \implies \color{darkred}w_{i,j}=\color{darkred}p_j \end{align}\]
- We can also use a SOS1 formulation. \[\begin{align}& \color{darkred}y_i = \sum_j \color{darkred}w_{i,j} \\ & \color{darkred}x_{i,j}+\color{darkred}s^1_{i,j}=1\\ & \color{darkred}w_{i,j}+\color{darkred}s^2_{i,j}=0 \\ & \color{darkred}w_{i,j}+\color{darkred}s^3_{i,j}=\color{darkred}p_j \\ & (\color{darkred}s^1_{i,j}, \color{darkred}s^2_{i,j}) \in SOS1 \\ & (\color{darkred}x_{i,j},\color{darkred}s^3_{i,j})\in SOS1 \end{align}\] The slack variables are free. This formulation has a lot of SOS1 sets, each one having two members.
---- 203 PARAMETER results solver performance status obj time nodes iterations Baron.quad Optimal -510 876 1 Baron.quad+extra Optimal -510 1309 1 Cplex.bin Optimal -510 1307 53114 21599612 Cplex.bin+extra Optimal -510 691 49318 15245649 Cplex.indic Optimal -510 1098 1170199 69241140 Cplex.indic+extra Optimal -510 144 209764 7843900 Cplex.sos1 TimeLimit -414 3611 959601 15280454 Cplex.sos1+extra Optimal -510 3362 2149403 51270681
We explore the same model but with a non-constant objective function. That is: Max z = sum(y_i * i). The impact on solve time for the HiGHS solver is dramatic. See https://www.solvermax.com/blog/objectives-matter-sorting-using-a-mip-model
ReplyDeleteYou can sort using LP only. See for instance this stackexchange entry: https://cs.stackexchange.com/questions/4805/sorting-as-a-linear-program
ReplyDeleteBut what if the values to sort are endogenous?
DeleteNot sure, but maybe sorting networks https://en.m.wikipedia.org/wiki/Sorting_network might be of interest in that context as well.
ReplyDeleteHere is some example code, how to sort 5 expressions (not parameters)
Delete```py
import mip # pip install python-mip
def sorted(model: mip.Model, *expressions: mip.LinExpr|mip.Var) -> list[mip.LinExpr]:
"sort up linear expressions in ascending order using [sorting networks](https://bertdobbelaere.github.io/sorting_networks.html)"
sorting_network = [
[(0,1)],
[(0,2),(0,1),(1,2)],
[(0,2),(1,3),(0,1),(2,3),(1,2)],
[(0,3),(1,4),(0,2),(1,3),(0,1),(2,4),(1,2),(3,4),(2,3)],
# ...
]
if len(expressions) - 2 >= len(sorting_network):
raise ValueError(f"sorting of {len(expressions)} expressions is not supported")
def sort(a: mip.LinExpr|mip.Var, b: mip.LinExpr|mip.Var) -> list[mip.LinExpr]:
pos, neg = model.add_var(lb=0), model.add_var(lb=0)
model.add_constr(a - b == pos - neg)
model.add_sos([(pos, 1), (neg, 2)], sos_type=1) # might not be needed
return [
0.5 * (a + b + pos + neg),
0.5 * (a + b - pos - neg),
]
result = list(expressions)
if len(expressions) >= 2:
for a, b in sorting_network[len(expressions) - 2]:
result[a], result[b] = sort(result[a], result[b])
return result
model = mip.Model()
n = 5
x = model.add_var_tensor((n,), "x")
sorted_x = sorted(model, *x)
for i, item in enumerate(sorted_x):
print(f"x_({i}) = {item}")
print("s.t.")
for constr in model.constrs:
print(constr)
```