Monday, May 1, 2017

Linearizing an average

In (1) a question is asked about averaging a certain allocation.

Let \(i\) be the set of prices to allocate to buckets \(j\). As each price can go to at most one bucket, it makes sense to use a binary variable:

\[x_{i,j} = \begin{cases} 1 & \text{if price $i$ is allocated to bucket $j$}\\
                                   0 & \text{otherwise}\end{cases}\]

The objective is to minimize the spread of the average prices seen in each bucket. A (nonlinear) model can look like:

\[
\bbox[lightcyan,10px,border:3px solid darkblue]
{\begin{align} \min\>& \left(\max_j  \mu_j – \min_j \mu_j \right)\\
                             & \mu_j = \frac{\displaystyle\sum_i p_i x_{i,j}}{\displaystyle\sum_i x_{i,j}}\\
                             &\sum_j x_{i,j} = 1 & \forall i\\
                             &x_{i,j} \in \{0,1\}
\end{align}}\]

Here the variable \(\mu_j\) is the average price in bucket \(j\). We have two nonlinearities in this model: the objective, and the calculation of \(\mu_j\).

The objective is easily linearized:

\[\begin{align}
\min\> & \mu_{max} – \mu_{min}\\
         & \mu_{max} \ge \mu_j & \forall j\\
         & \mu_{min} \le \mu_j & \forall j
\end{align}\]

But what about the calculation of the average? We can write:

\[\mu_j \sum_i x_{i,j} = \sum_i p_i x_{i,j} \]

This does not help much. If we introduce \(z_j = \sum_i x_{i,j}\) then the left-hand side is \(\mu_j z_j\) which is still not easy to linearize. However, a better approach is to write

\[\sum_i \mu_j x_{i,j} = \sum_i p_i x_{i,j} \]

The expressions \(\mu_j x_{i,j}\) are still non-linear, but they are a multiplication of a binary variables times a continuous variable. This we know how to linearize (2).

The complete linearized model can look like:

\[\bbox[lightcyan,10px,border:3px solid darkblue]{\begin{align}
\min\> & \mu_{max} – \mu_{min}\\
         & \mu_{max} \ge \mu_j & \forall j\\
         & \mu_{min} \le \mu_j & \forall j \\
         & \sum_i y_{i,j} = \sum_i p_i x_{i,j} & \forall j\\
         & y_{i,j} \le U x_{i,j}\\
         & y_{i,j} \le \mu_j\\
         & y_{i,j} \ge \mu_j – U(1-x_{i,j})\\
         &\sum_j x_{i,j} = 1 & \forall i\\
         &y_{i,j} \in [0,U]\\
         &x_{i,j} \in \{0,1\}
\end{align}}\]

where \(U=\displaystyle \max_i p_i\) (this is data). The variable \(y_{i,j}\) can be interpreted as \(y_{i,j}=\mu_j x_{i,j}\).

In a real application we would have other restrictions as well. The most obvious is that we want to allocate at least one price to a bucket. This is simply done by requiring \(\displaystyle\sum_i x_{i,j} \ge 1\). Here is some example output while including this constraint:

----     53 SET i  prices

i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9
i10


----     53 SET j  buckets

j1,    j2,    j3,    j4,    j5


----     53 PARAMETER p  price

i1  1.717,    i2  8.433,    i3  5.504,    i4  3.011,    i5  2.922
i6  2.241,    i7  3.498,    i8  8.563,    i9  0.671,    i10 5.002


----     53 VARIABLE x.L  assignment

             j1          j2          j3          j4          j5

i1            1
i2                        1
i3                                    1
i4                                                1
i5            1
i6                                    1
i7                                                            1
i8            1
i9                        1
i10                                               1


----     53 VARIABLE mu.L  average

j1 4.401,    j2 4.552,    j3 3.872,    j4 4.007,    j5 3.498


----     53 VARIABLE y.L  average or zero

             j1          j2          j3          j4          j5

i1        4.401
i2                    4.552
i3                                3.872
i4                                            4.007
i5        4.401
i6                                3.872
i7                                                        3.498
i8        4.401
i9                    4.552
i10                                           4.007


----     53 VARIABLE mumin.L               =        3.498 
            VARIABLE mumax.L               =        4.552 

Side note: What would happen if we allow zero prices to go into a bucket? The model would not restrict \(\mu_j\) for such a bucket. In extremis we can put all prices in the first bucket (with average price \(\mu_1\)), and then have all the remaining buckets empty. That would allow the model to pick \(\mu_j=\mu_1\) i.e. no spread. That is exactly what I got when I tried this (by dropping the constraint \(\displaystyle\sum_i x_{i,j} \ge 1\)). The results with the same data as before:

----     53 VARIABLE x.L  assignment

             j1

i1            1
i2            1
i3            1
i4            1
i5            1
i6            1
i7            1
i8            1
i9            1
i10           1


----     53 VARIABLE mu.L  average

j1 4.156,    j2 4.156,    j3 4.156,    j4 4.156,    j5 4.156


----     53 VARIABLE y.L  average or zero

             j1

i1        4.156
i2        4.156
i3        4.156
i4        4.156
i5        4.156
i6        4.156
i7        4.156
i8        4.156
i9        4.156
i10       4.156


----     53 VARIABLE mumin.L               =        4.156 
            VARIABLE mumax.L               =        4.156 

The same trick can be used to linearize the problem stated in (3).

References
  1. Fair allocations to buckets with constraints, http://stackoverflow.com/questions/43662264/fair-allocation-to-buckets-with-constraints
  2. Multiplication of a continuous and a binary variable, http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html 
  3. Convert nonlinear objective function to linear objective function, http://stackoverflow.com/questions/43591174/convert-nonlinear-objective-function-to-linear-objective-function