This is not, per se, very useful: if you need a sorted parameter in a model, it is better to sort the data beforehand with an ordinary sorting algorithm. But sorting a parameter inside a MIP model is not easy for MIP solvers, and useless models can still be interesting.
I use:
- Input: a 1-dimensional parameter with values.
- Output: a 1-dimensional variable with the values sorted in ascending order.
We can implement this with a permutation matrix X, which is a permuted identity matrix. In a MIP context, this becomes a binary variable with some assignment constraints.
MIP Model for sorting $p_i$:

$$\begin{aligned}
\min\ & z = 0 \\
& \sum_i x_{i,j} = 1 && \forall j \\
& \sum_j x_{i,j} = 1 && \forall i \\
& y_i = \sum_j x_{i,j}\cdot p_j \\
& y_i \ge y_{i-1} \\
& x_{i,j} \in \{0,1\}
\end{aligned}$$
If we want to sort $n$ values, this model has $n^2$ binary variables.
One possible improvement is to add the extra constraint $\sum_i y_i = \sum_i p_i$.
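To make the formulation concrete, here is a minimal sketch in python-mip (the package also used in a comment below); the instance (random $p$, $n=10$) and all names are mine, and the last constraint is the optional extra one (model 2):

```py
import random
import mip  # pip install python-mip

n = 10
p = [random.uniform(0, 100) for _ in range(n)]  # the parameter to sort

m = mip.Model()
x = [[m.add_var(var_type=mip.BINARY) for j in range(n)] for i in range(n)]  # permutation matrix
y = [m.add_var(lb=0, ub=100) for i in range(n)]                             # sorted values

for j in range(n):
    m.add_constr(mip.xsum(x[i][j] for i in range(n)) == 1)   # each value used once
for i in range(n):
    m.add_constr(mip.xsum(x[i][j] for j in range(n)) == 1)   # each position filled once
    m.add_constr(y[i] == mip.xsum(p[j] * x[i][j] for j in range(n)))
for i in range(1, n):
    m.add_constr(y[i] >= y[i - 1])                           # ascending order

m.add_constr(mip.xsum(y) == sum(p))  # optional extra constraint (model 2)

# no objective needed: any feasible solution yields the sorted vector (min 0 by default)
if m.optimize() == mip.OptimizationStatus.OPTIMAL:
    print([round(y[i].x, 2) for i in range(n)])
```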
Let's compare a commercial solver (Cplex) with an open-source one (HiGHS). I use $n=50, 100, 200$. Model 1 is without the extra constraint, while model 2 includes it. The data $p_i$ are random. A time limit of 1,000 seconds was used; otherwise, the solvers ran with default settings.
```
----     96 PARAMETER results  solver performance

                          status         time       nodes  iterations

cplex.n=50 .model1       Optimal  2.970000E-1                    2408
cplex.n=50 .model2       Optimal  2.500000E-1                    2151
cplex.n=100.model1       Optimal            4                   14055
cplex.n=100.model2       Optimal            2                    7092
cplex.n=200.model1       Optimal          227        3043     1684124
cplex.n=200.model2       Optimal           18                   10155
highs.n=50 .model1       Optimal            2
highs.n=50 .model2       Optimal            2
highs.n=100.model1     TimeLimit         1001        9587     4955336
highs.n=100.model2     TimeLimit         1000       15221     5373228
```
- Cplex does much better on the larger models.
- HiGHS is very promising on the n=50 model: it solves this in the presolver (zero nodes, zero LP iterations). But it runs out of steam for large models. I would expect that if the presolver can solve the problem for n=50, it would also be able to do this for n=100. As usual, my predictions are not that good.
- The extra constraint is useful for Cplex, especially for the larger models, both in terms of time and LP iterations.
Numerical issues encountered:
```
Dual failures  : 831 out of  5345 (ratio: 0.1555)
Singular bases :   1 out of 50821 (ratio: 0.0000)
```
Sorting a variable

If the values $p_j$ are themselves decision variables rather than constants, the products $x_{i,j}\cdot p_j$ are no longer linear. Some ways to handle this:
- Non-convex quadratic: $y_i = \sum_j x_{i,j}\cdot p_j$. You need a global quadratic solver for this.
- Linearization using binary variables (the general case). Assume $p_j \in [\ell_j, u_j]$. We can linearize the products $w_{i,j} = x_{i,j}\cdot p_j$ as follows:
  $$\begin{aligned} & y_i = \sum_j w_{i,j} \\ & \ell_j\cdot x_{i,j} \le w_{i,j} \le u_j\cdot x_{i,j} \\ & p_j - u_j\cdot(1-x_{i,j}) \le w_{i,j} \le p_j - \ell_j\cdot(1-x_{i,j}) \\ & w_{i,j} \in [\min\{0,\ell_j\},\max\{0,u_j\}] \end{aligned}$$
- If $p_j \in [0, u_j]$, things become a bit simpler (see the sketch after this list):
  $$\begin{aligned} & y_i = \sum_j w_{i,j} \\ & w_{i,j} \le u_j\cdot x_{i,j} \\ & p_j - u_j\cdot(1-x_{i,j}) \le w_{i,j} \le p_j \\ & w_{i,j} \in [0, u_j] \end{aligned}$$
- Using indicator constraints, we can write:
  $$\begin{aligned} & y_i = \sum_j w_{i,j} \\ & x_{i,j} = 0 \implies w_{i,j} = 0 \\ & x_{i,j} = 1 \implies w_{i,j} = p_j \end{aligned}$$
- We can also use a SOS1 formulation:
  $$\begin{aligned} & y_i = \sum_j w_{i,j} \\ & x_{i,j} + s^{1}_{i,j} = 1 \\ & w_{i,j} + s^{2}_{i,j} = 0 \\ & w_{i,j} + s^{3}_{i,j} = p_j \\ & (s^{1}_{i,j}, s^{2}_{i,j}) \in \mathrm{SOS1} \\ & (x_{i,j}, s^{3}_{i,j}) \in \mathrm{SOS1} \end{aligned}$$
  The slack variables are free. This formulation has a lot of SOS1 sets, each with just two members.
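As an illustration, here is a rough python-mip sketch of the binary-variable linearization for the $p_j \in [0, u_j]$ case; the bounds and names are made up, and in a real model $p$ would of course be pinned down by other constraints and an objective:

```py
import mip  # pip install python-mip

n = 5
u = [10.0] * n  # assumed upper bounds on p

m = mip.Model()
p = [m.add_var(lb=0, ub=u[j]) for j in range(n)]  # values to sort (now variables)
x = [[m.add_var(var_type=mip.BINARY) for j in range(n)] for i in range(n)]
w = [[m.add_var(lb=0, ub=u[j]) for j in range(n)] for i in range(n)]  # w[i][j] = x[i][j]*p[j]
y = [m.add_var(lb=0, ub=max(u)) for i in range(n)]  # sorted copy of p

for j in range(n):
    m.add_constr(mip.xsum(x[i][j] for i in range(n)) == 1)
for i in range(n):
    m.add_constr(mip.xsum(x[i][j] for j in range(n)) == 1)
    m.add_constr(y[i] == mip.xsum(w[i][j] for j in range(n)))
    for j in range(n):
        m.add_constr(w[i][j] <= u[j] * x[i][j])               # x = 0  =>  w = 0
        m.add_constr(w[i][j] >= p[j] - u[j] * (1 - x[i][j]))  # x = 1  =>  w >= p
        m.add_constr(w[i][j] <= p[j])                         # and w <= p
for i in range(1, n):
    m.add_constr(y[i] >= y[i - 1])
```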
Here are results for these formulations:

```
----    203 PARAMETER results  solver performance

                        status       obj      time     nodes  iterations

Baron.quad             Optimal      -510       876         1
Baron.quad+extra       Optimal      -510      1309         1
Cplex.bin              Optimal      -510      1307     53114    21599612
Cplex.bin+extra        Optimal      -510       691     49318    15245649
Cplex.indic            Optimal      -510      1098   1170199    69241140
Cplex.indic+extra      Optimal      -510       144    209764     7843900
Cplex.sos1           TimeLimit      -414      3611    959601    15280454
Cplex.sos1+extra       Optimal      -510      3362   2149403    51270681
```
We explore the same model but with a non-constant objective function. That is: Max z = sum(y_i * i). The impact on solve time for the HiGHS solver is dramatic. See https://www.solvermax.com/blog/objectives-matter-sorting-using-a-mip-model
You can sort using LP only. See for instance this stackexchange entry: https://cs.stackexchange.com/questions/4805/sorting-as-a-linear-program
But what if the values to sort are endogenous?
Not sure, but maybe sorting networks https://en.m.wikipedia.org/wiki/Sorting_network might be of interest in that context as well.
Here is some example code showing how to sort 5 expressions (not parameters):
```py
import mip  # pip install python-mip


def sorted(model: mip.Model, *expressions: mip.LinExpr | mip.Var) -> list[mip.LinExpr]:
    "sort linear expressions in ascending order using [sorting networks](https://bertdobbelaere.github.io/sorting_networks.html)"
    sorting_network = [
        [(0, 1)],
        [(0, 2), (0, 1), (1, 2)],
        [(0, 2), (1, 3), (0, 1), (2, 3), (1, 2)],
        [(0, 3), (1, 4), (0, 2), (1, 3), (0, 1), (2, 4), (1, 2), (3, 4), (2, 3)],
        # ...
    ]
    if len(expressions) - 2 >= len(sorting_network):
        raise ValueError(f"sorting of {len(expressions)} expressions is not supported")

    def sort(a: mip.LinExpr | mip.Var, b: mip.LinExpr | mip.Var) -> list[mip.LinExpr]:
        # split a - b into positive and negative parts, so that pos + neg == |a - b|
        pos, neg = model.add_var(lb=0), model.add_var(lb=0)
        model.add_constr(a - b == pos - neg)
        model.add_sos([(pos, 1), (neg, 2)], sos_type=1)  # at most one nonzero; might not be needed
        return [
            0.5 * (a + b - pos - neg),  # min(a, b)
            0.5 * (a + b + pos + neg),  # max(a, b)
        ]

    result = list(expressions)
    if len(expressions) >= 2:
        # each comparator (a, b) with a < b places the smaller value at position a
        for a, b in sorting_network[len(expressions) - 2]:
            result[a], result[b] = sort(result[a], result[b])
    return result


model = mip.Model()
n = 5
x = model.add_var_tensor((n,), "x")
sorted_x = sorted(model, *x)
for i, item in enumerate(sorted_x):
    print(f"x_({i}) = {item}")
print("s.t.")
for constr in model.constrs:
    print(constr)
```
I thought about this problem while preparing for my upcoming optimisation seminar (https://instats.org/seminar/decision-optimization-with-classical-and), and I have a suggestion very similar to the one proposed by Christian, i.e. using a sorting network, but instead of special ordered sets I use a single binary variable and two continuous ones for each comparison (inspired by or.stackexchange.com/questions/1160/how-to-linearize-min-function-as-a-constraint).
Let $X$ and $Y$ be two endogenous values, with bounds $[L_x, U_x]$ and $[L_y, U_y]$, respectively, and define $Z$ to be a binary (0-1) variable constrained by:
$$Y - X \le (U_y - L_x)\,Z$$
$$X - Y \le (U_x - L_y)\,(1 - Z)$$

These constraints ensure that $Z = 0$ implies $X \ge Y$, whereas $Z = 1$ implies $Y \ge X$ (note that the case of equality is left ambiguous).

Using this auxiliary variable $Z$ we can define $W = \min(X, Y)$ and $V = \max(X, Y)$ as follows:

$$W \le X, \quad W \le Y, \quad W \ge X + (1 - Z)(L_y - U_x), \quad W \ge Y + Z\,(L_x - U_y)$$
$$V = X + Y - W$$
The new bounds for $W$ and $V$ become $[\min(L_x, L_y), \min(U_x, U_y)]$ and $[\max(L_x, L_y), \max(U_x, U_y)]$, respectively, and the process can be repeated.
This approach ensures that we can repeatedly propagate values through the sorting network, and requires only $O(N \log N)$ extra variables and constraints, with no products involved at all; it also does not require tampering with the objective function. I should make the disclaimer that I haven't tested my approach yet.
Leonid
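A rough python-mip sketch of one such comparator, following the constraints in the previous comment (untested, and the names are illustrative):

```py
import mip  # pip install python-mip


def comparator(m: mip.Model, X: mip.Var, Y: mip.Var,
               Lx: float, Ux: float, Ly: float, Uy: float):
    """Return W = min(X, Y) and V = max(X, Y) using one binary variable Z."""
    Z = m.add_var(var_type=mip.BINARY)
    W = m.add_var(lb=min(Lx, Ly), ub=min(Ux, Uy))  # W = min(X, Y)
    m.add_constr(Y - X <= (Uy - Lx) * Z)           # Z = 0  =>  X >= Y
    m.add_constr(X - Y <= (Ux - Ly) * (1 - Z))     # Z = 1  =>  Y >= X
    m.add_constr(W <= X)
    m.add_constr(W <= Y)
    m.add_constr(W >= X + (1 - Z) * (Ly - Ux))     # binding when Z = 1 (X is the min)
    m.add_constr(W >= Y + Z * (Lx - Uy))           # binding when Z = 0 (Y is the min)
    V = X + Y - W                                  # V = max(X, Y), a linear expression
    return W, V


m = mip.Model()
a = m.add_var(lb=0, ub=10)
b = m.add_var(lb=0, ub=10)
lo, hi = comparator(m, a, b, 0, 10, 0, 10)
# lo and hi can now feed further comparators of a sorting network
```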