Problem Description
This is adapted from [1].
Assume we have data \(a_i\) and \(b_j\):
----     11 PARAMETER a

i1  0.172,    i2  0.843,    i3  0.550,    i4  0.301,    i5  0.292
i6  0.224,    i7  0.350,    i8  0.856,    i9  0.067,    i10 0.500

----     11 PARAMETER b

j1  0.998,    j2  0.621,    j3  0.992,    j4  0.786,    j5  0.218
j6  0.676,    j7  0.244,    j8  0.325,    j9  0.702,    j10 0.492
j11 0.424,    j12 0.416,    j13 0.218,    j14 0.235,    j15 0.630
Try to find an assignment of each \(a_i\) to a unique \(b_j\) such that the quantities \[ \frac{a_i}{b_j}\] are as close to each other as possible.
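Here is a small GAMS fragment that declares this data directly (the identifiers are my own choice; the values look like uniform(0,1) draws, but I simply enter them as-is):

```gams
set
   i 'items to assign'  /i1*i10/
   j 'available slots'  /j1*j15/;

parameter a(i) /
   i1 0.172, i2 0.843, i3 0.550, i4 0.301, i5  0.292
   i6 0.224, i7 0.350, i8 0.856, i9 0.067, i10 0.500 /;

parameter b(j) /
   j1  0.998, j2  0.621, j3  0.992, j4  0.786, j5  0.218
   j6  0.676, j7  0.244, j8  0.325, j9  0.702, j10 0.492
   j11 0.424, j12 0.416, j13 0.218, j14 0.235, j15 0.630 /;

display a, b;
```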
This is more or less an assignment problem. What makes it interesting is the objective: make all ratios \(a_i/b_j\) as equal as possible. I'll discuss some formulations for this problem.
A first model
We first introduce assignment variables: \[x_{i,j} = \begin{cases} 1 & \text{if $a_i$ is assigned to $b_j$} \\ 0 & \text{otherwise}\end{cases}\] The standard assignment constraints apply \[\begin{align}&\sum_i x_{i,j}\le 1&&\forall j \\&\sum_j x_{i,j} = 1&&\forall i \end{align}\] This is an "unbalanced" assignment: we have more \(j\)'s than \(i\)'s.
To model "all about equal", we introduce a continuous variable \(c\) that represents the common value. I.e. \[\frac{a_i}{b_j} \approx c \>\>\text{for all $(i,j)$ such that $x_{i,j}=1$}\]
The objective is \[\min \sum_{i,j} \left[ x_{i,j} \left( \frac{a_i}{b_j} - c\right) \right]^2 \] or \[\min \sum_{i,j} x_{i,j} \left( \frac{a_i}{b_j} - c\right)^2 \]
A complete model can look like:
MINLP Model

\[\begin{align} \min & \sum_{i,j} \color{darkred}x_{i,j} \left( \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkred}c\right)^2 \\ & \sum_j \color{darkred} x_{i,j} = 1 &&\forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 &&\forall j\\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \text{ free} \end{align}\]
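A minimal GAMS sketch of this MINLP (variable and equation names are my own; the data declarations from the fragment above are assumed):

```gams
binary variable x(i,j) 'assignment';
variables c 'common value', z 'objective';

equations
   obj       'sum of squared deviations'
   rowsum(i) 'assign each i to exactly one j'
   colsum(j) 'use each j at most once';

obj..       z =e= sum((i,j), x(i,j)*sqr(a(i)/b(j) - c));
rowsum(i).. sum(j, x(i,j)) =e= 1;
colsum(j).. sum(i, x(i,j)) =l= 1;

model m1 /all/;
solve m1 minimizing z using minlp;
display x.l, c.l, z.l;
```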
When we run this model using an MINLP solver, we see:
----     33 VARIABLE x.L  assignment

i1 .j7  1.000,    i2 .j3  1.000,    i3 .j4  1.000,    i4 .j11 1.000,    i5 .j12 1.000
i6 .j8  1.000,    i7 .j10 1.000,    i8 .j1  1.000,    i9 .j5  1.000,    i10.j9  1.000

----     33 VARIABLE c.L  =  0.695  common value
            VARIABLE z.L  =  0.201  objective
The assignment can be depicted as follows.
Figure: Solution. Matrix values are a(i)/b(j).
A quadratic model
We can reformulate this MINLP model into a quadratic model by introducing variables \[y_{i,j} = x_{i,j} c\] This non-linear constraint can be linearized by stating \[\begin{align} & x_{i,j} = 0 \Rightarrow y_{i,j} = 0\\ &x_{i,j} = 1 \Rightarrow y_{i,j} = c\end{align}\] With these \(y\) variables we can form an objective: \[\sum_{i,j} \left( x_{i,j} \frac{a_i}{b_j} - y_{i,j} \right)^2 \] Unfortunately, this is a non-convex objective. There are not many non-convex MIQP solvers around (Cplex has one). This model gives the same solutions as the MINLP model above. I am not showing the results here, as they are identical to the ones in the previous section.
Non-convex MIQP Model

\[\begin{align} \min & \sum_{i,j} \left( \color{darkred}x_{i,j} \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkred}y_{i,j}\right)^2 \\ & \sum_j \color{darkred} x_{i,j} = 1 &&\forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 &&\forall j\\ & \color{darkred} x_{i,j} = 0 \Rightarrow \color{darkred}y_{i,j} = 0 &&\forall i,j\\ & \color{darkred} x_{i,j} = 1 \Rightarrow \color{darkred}y_{i,j} = \color{darkred}c &&\forall i,j\\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}y_{i,j} \ge 0 \\ & \color{darkred}c \ge 0 \end{align}\]
We assumed here positive fractions \(a_i/b_j \gt 0\). This allows us to say \(c\ge 0\) and \(y_{i,j}\ge 0\).
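Here is a sketch of this MIQP in GAMS, with two liberties: the implications are implemented as big-\(M\) constraints (derived in the next section) rather than indicator constraints, and with Cplex a non-convex quadratic objective needs, if I remember correctly, the option optimalitytarget 3. It reuses x, c, z, rowsum and colsum from the MINLP sketch:

```gams
scalar M 'big-M bound';
M = smax((i,j), a(i)/b(j));
c.lo = 0;

positive variable y(i,j) 'represents x(i,j)*c';

equations
   obj2       'non-convex quadratic objective'
   ybnd(i,j)  'forces y(i,j) = 0 when x(i,j) = 0'
   ylo(i,j)   'forces y(i,j) >= c when x(i,j) = 1'
   yup(i,j)   'forces y(i,j) <= c when x(i,j) = 1';

obj2..      z =e= sum((i,j), sqr(x(i,j)*a(i)/b(j) - y(i,j)));
ybnd(i,j).. y(i,j) =l= M*x(i,j);
ylo(i,j)..  y(i,j) =g= c - M*(1-x(i,j));
yup(i,j)..  y(i,j) =l= c + M*(1-x(i,j));

model m2 /obj2, rowsum, colsum, ybnd, ylo, yup/;
solve m2 minimizing z using miqcp;
```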
A linear model
We can approximate the quadratic weighting of the residuals by absolute values: \[\sum_{i,j} \left| x_{i,j} \left( \frac{a_i}{b_j} - c \right) \right| \] The first thing to do is to linearize the term \(x_{i,j} c\). As in the quadratic model, we introduce variables \[y_{i,j} = x_{i,j} c\] and form the implications: \[\begin{align} & x_{i,j} = 0 \Rightarrow y_{i,j} = 0\\ &x_{i,j} = 1 \Rightarrow y_{i,j} = c\end{align}\] Most solvers support indicator constraints, which allow us to implement these implications as is. If we have a solver without them, we can formulate this as a bunch of big-\(M\) constraints: \[\begin{align} & - M x_{i,j} \le y_{i,j} \le M x_{i,j} \\ & c-M(1-x_{i,j}) \le y_{i,j} \le c+M(1-x_{i,j})\end{align}\] From the data we can assume \(a_{i}\ge 0\) and \(b_{j}\gt 0\). We can exploit this: \[\begin{align} & 0 \le y_{i,j} \le M_1 x_{i,j} \\ & c-M_2(1-x_{i,j}) \le y_{i,j} \le c+M_2(1-x_{i,j})\end{align}\] We can think a bit about good values for \(M_1\) and \(M_2\). I suggest: \[M_1 = M_2 = \max c = \max_{i,j} \frac{a_i}{b_j}\] The complete model can look like:
MIP Model

\[\begin{align} \min & \sum_{i,j} \color{darkred}r_{i,j} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i\\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j\\ & -\color{darkred}r_{i,j} \le \color{darkred} x_{i,j} \frac{\color{darkblue} a_i}{\color{darkblue} b_j} - \color{darkred} y_{i,j} \le \color{darkred}r_{i,j} &&\forall i,j\\ & \color{darkred} y_{i,j} \le \color{darkblue} M_1 \color{darkred} x_{i,j} &&\forall i,j\\ & \color{darkred} c - \color{darkblue} M_2 (1- \color{darkred} x_{i,j}) \le \color{darkred} y_{i,j} \le \color{darkred} c + \color{darkblue} M_2 (1- \color{darkred} x_{i,j})&&\forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\}\\ & \color{darkred}c \ge 0 \\ & \color{darkred}y_{i,j} \ge 0\\ & \color{darkred}r_{i,j} \ge 0\\ \end{align}\]
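A GAMS sketch of this MIP, reusing the declarations (including y and the big-\(M\) constraints) from the MIQP sketch, with \(M_1=M_2=M\) as suggested above:

```gams
positive variable r(i,j) 'absolute values of residuals';

equations
   obj3      'sum of absolute residuals'
   rlo(i,j)  'residual bounded below by -r'
   rup(i,j)  'residual bounded above by r';

obj3..     z =e= sum((i,j), r(i,j));
rlo(i,j).. x(i,j)*a(i)/b(j) - y(i,j) =g= -r(i,j);
rup(i,j).. x(i,j)*a(i)/b(j) - y(i,j) =l= r(i,j);

model m3 /obj3, rowsum, colsum, rlo, rup, ybnd, ylo, yup/;
solve m3 minimizing z using mip;
display x.l, c.l, y.l, r.l;
```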
The results look like:
----     72 VARIABLE x.L  assignment

i1 .j7  1.000,    i2 .j3  1.000,    i3 .j4  1.000,    i4 .j11 1.000,    i5 .j12 1.000
i6 .j8  1.000,    i7 .j10 1.000,    i8 .j1  1.000,    i9 .j5  1.000,    i10.j9  1.000

----     72 VARIABLE c.L  =  0.711  common value

----     72 VARIABLE y.L  c*x(i,j)

i1 .j7  0.711,    i2 .j3  0.711,    i3 .j4  0.711,    i4 .j11 0.711,    i5 .j12 0.711
i6 .j8  0.711,    i7 .j10 0.711,    i8 .j1  0.711,    i9 .j5  0.711,    i10.j9  0.711

----     72 VARIABLE r.L  abs(residuals)

i1 .j7  0.006,          i2 .j3  0.139,    i3 .j4  0.010,    i5 .j12 0.009
i6 .j8  0.021,          i7 .j10 6.137042E-4,                i8 .j1  0.147
i9 .j5  0.402,          i10.j9  0.002
The model results are very close: the assignments are the same; only the \(c\) value and the resulting sums of squared and absolute residuals differ somewhat.
----     77 PARAMETER sumdev  sum of squared/absolute deviations

                   dev^2       |dev|           c
minlp model        0.201       0.784       0.695
mip model          0.204       0.737       0.711
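This comparison is just post-processing of the solutions; one row of the table can be computed along these lines (sumdev is a reporting parameter, the names are mine):

```gams
parameter sumdev(*,*) 'sum of squared/absolute deviations';
sumdev('mip model','dev^2') = sum((i,j)$(x.l(i,j) > 0.5), sqr(a(i)/b(j) - c.l));
sumdev('mip model','|dev|') = sum((i,j)$(x.l(i,j) > 0.5), abs(a(i)/b(j) - c.l));
sumdev('mip model','c')     = c.l;
display sumdev;
```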
A different approach
We can look at the problem in a different way. Instead of minimizing the spread around a central value \(c\), we can minimize the bandwidth directly. I.e., we can write: \[\begin{align} \min\> & \mathit{maxv} - \mathit{minv} \\ & \mathit{minv} \le \frac{a_i}{b_j} && \forall x_{i,j}=1\\& \mathit{maxv} \ge \frac{a_i}{b_j} && \forall x_{i,j}=1 \end{align}\] These constraints can be implemented as indicator constraints: \[\begin{align} & x_{i,j}=1 \Rightarrow \mathit{minv} \le \frac{a_i}{b_j}\\ & x_{i,j}=1 \Rightarrow \mathit{maxv} \ge \frac{a_i}{b_j}\end{align}\] With GAMS/Cplex this can be done through an option file, as sketched below. If we don't have indicator constraints available, we need to resort to big-\(M\) constraints, as in the model that follows.
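Suppose the bounds are stated as plain equations without big-\(M\) terms, say minbnd(i,j): \(\mathit{minv} \le a_i/b_j\) and maxbnd(i,j): \(\mathit{maxv} \ge a_i/b_j\) (hypothetical names). A Cplex option file can then attach the binary variables as indicators (the model is solved with optfile set):

```gams
* cplex.opt
* enforce minbnd(i,j) and maxbnd(i,j) only when x(i,j) = 1
indic minbnd(i,j)$x(i,j) 1
indic maxbnd(i,j)$x(i,j) 1
```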
Alternative MIP Model

\[\begin{align} \min\> & \color{darkred}{\mathit{maxv}} - \color{darkred}{\mathit{minv}} \\ & \sum_j \color{darkred} x_{i,j} = 1 && \forall i \\ & \sum_i \color{darkred} x_{i,j} \le 1 && \forall j \\ & \color{darkred}{\mathit{minv}} \le \frac{\color{darkblue}a_i}{\color{darkblue}b_j} + \color{darkblue}M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}{\mathit{maxv}} \ge \frac{\color{darkblue}a_i}{\color{darkblue}b_j} - \color{darkblue}M (1-\color{darkred} x_{i,j}) && \forall i,j \\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}{\mathit{maxv}},\color{darkred}{\mathit{minv}} \text{ free} \end{align}\]
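A GAMS sketch of this alternative model, again reusing x, rowsum, colsum and the big-\(M\) value from before:

```gams
variables minv 'minimum selected ratio'
          maxv 'maximum selected ratio'
          w    'bandwidth';

equations
   obj4        'minimize the bandwidth'
   lobnd(i,j)  'minv must lie below every selected ratio'
   upbnd(i,j)  'maxv must lie above every selected ratio';

obj4..       w =e= maxv - minv;
lobnd(i,j).. minv =l= a(i)/b(j) + M*(1-x(i,j));
upbnd(i,j).. maxv =g= a(i)/b(j) - M*(1-x(i,j));

model m4 /obj4, rowsum, colsum, lobnd, upbnd/;
solve m4 minimizing w using mip;
display x.l, minv.l, maxv.l;
```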
The solution looks like:
----    103 VARIABLE x.L  assignment

        j1      j2      j3      j4      j5      j6      j9     j11     j14     j15
(each row i1,...,i10 has a single entry 1.000 in one of these columns)

----    103 VARIABLE minv.L  =  0.308  minimum value
            VARIABLE maxv.L  =  0.858  maximum value
A picture of this solution:
Figure: Minimize bandwidth solution.
The disadvantage of this method is that it does not care where the selected ratios fall, as long as they are inside the optimal bandwidth. We can see this if we recalculate an optimal central \(c\) value afterwards. The results with this reconstructed \(c\) value:
----    114 PARAMETER sumdev  sum of squared/absolute deviations

                   dev^2       |dev|           c
minlp model        0.201       0.784       0.695
mip model          0.204       0.737       0.711
mip model 2        0.313       1.548       0.617
Indeed, we see that the total sum of squared or absolute deviations is not as good as we saw before. The same thing can be seen by just calculating the standard deviation for the selected ratios \(a_i/b_j\). This gives:
----    130 PARAMETER stdev

minlp model 0.149,    mip model   0.149,    mip model 2 0.186
Again we see this last approach leads to more variability.
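The reported standard deviation appears to use the sample formula with denominator \(n-1\) (e.g. \(\sqrt{0.201/9}\approx 0.149\) for the MINLP solution); a sketch of this calculation for one solution:

```gams
scalars nsel 'number of selected cells'
        mean 'mean of selected ratios'
        sd   'standard deviation of selected ratios';

nsel = sum((i,j)$(x.l(i,j) > 0.5), 1);
mean = sum((i,j)$(x.l(i,j) > 0.5), a(i)/b(j))/nsel;
sd   = sqrt(sum((i,j)$(x.l(i,j) > 0.5), sqr(a(i)/b(j) - mean))/(nsel - 1));
display sd;
```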
Figure: Distribution of selected a(i)/b(j).
In the above picture we see that this second MIP model has no incentive to keep values close to each other, beyond requiring that they lie within \([\mathit{minv},\mathit{maxv}]\). This model is simpler than the earlier MIP model, but we pay a price: the solution is not as concentrated around the central point.
Non-unique \(b_j\)
If we allow the \(b_j\) to be reused, only a small change in the models is needed. We just need to drop the constraint \[\sum_i x_{i,j} \le 1 \] When we do this, we may see a solution like:
Figure: Solution. Allow a column to be selected multiple times.
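In GAMS this amounts to a model statement that leaves out the column constraint; a sketch based on the earlier MIP model:

```gams
model m5 'b(j) may be reused' /obj3, rowsum, rlo, rup, ybnd, ylo, yup/;
solve m5 minimizing z using mip;
display x.l, c.l;
```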
Conclusion
This turned out to be an interesting problem. A high-level MINLP model was presented first. We can linearize it into a MIP, but this becomes a bit messy. Modern MIP solvers offer tools (indicator constraints, a built-in absolute value function) that can help here. Finally, an alternative, simpler MIP model was proposed. However, this model produces solutions that are more dispersed.