## Monday, May 1, 2017

### Linearizing an average

In (1) a question is asked about averaging a certain allocation.

Let $$i$$ be the set of prices to allocate to buckets $$j$$. As each price can go to at most one bucket, it makes sense to use a binary variable:

 $x_{i,j} = \begin{cases} 1 & \text{if price i is allocated to bucket j}\\ 0 & \text{otherwise}\end{cases}$

The objective is to minimize the spread of the average prices seen in each bucket. A (nonlinear) model can look like:

 \bbox[lightcyan,10px,border:3px solid darkblue]{\begin{align} \min\>& \left(\max_j \mu_j – \min_j \mu_j \right)\\ & \mu_j = \frac{\displaystyle\sum_i p_i x_{i,j}}{\displaystyle\sum_i x_{i,j}}\\ &\sum_j x_{i,j} = 1 & \forall i\\ &x_{i,j} \in \{0,1\}\end{align}}

Here the variable $$\mu_j$$ is the average price in bucket $$j$$. We have two nonlinearities in this model: the objective, and the calculation of $$\mu_j$$.

The objective is easily linearized:

 \begin{align}\min\> & \mu_{max} – \mu_{min}\\ & \mu_{max} \ge \mu_j & \forall j\\ & \mu_{min} \le \mu_j & \forall j\end{align}

But what about the calculation of the average? We can write:

 $\mu_j \sum_i x_{i,j} = \sum_i p_i x_{i,j}$

This does not help much. If we introduce $$z_j = \sum_i x_{i,j}$$ then the left-hand side is $$\mu_j z_j$$ which is still not easy to linearize. However, a better approach is to write

 $\sum_i \mu_j x_{i,j} = \sum_i p_i x_{i,j}$

The expressions $$\mu_j x_{i,j}$$ are still non-linear, but they are a multiplication of a binary variables times a continuous variable. This we know how to linearize (2).

The complete linearized model can look like:

 \bbox[lightcyan,10px,border:3px solid darkblue]{\begin{align}\min\> & \mu_{max} – \mu_{min}\\ & \mu_{max} \ge \mu_j & \forall j\\ & \mu_{min} \le \mu_j & \forall j \\ & \sum_i y_{i,j} = \sum_i p_i x_{i,j} & \forall j\\ & y_{i,j} \le U x_{i,j}\\ & y_{i,j} \le \mu_j\\ & y_{i,j} \ge \mu_j – U(1-x_{i,j})\\ &\sum_j x_{i,j} = 1 & \forall i\\ &y_{i,j} \in [0,U]\\ &x_{i,j} \in \{0,1\} \end{align}}

where $$U=\displaystyle \max_i p_i$$ (this is data). The variable $$y_{i,j}$$ can be interpreted as $$y_{i,j}=\mu_j x_{i,j}$$.

In a real application we would have other restrictions as well. The most obvious is that we want to allocate at least one price to a bucket. This is simply done by requiring $$\displaystyle\sum_i x_{i,j} \ge 1$$. Here is some example output while including this constraint:

 ----     53 SET i  prices i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9 i10 ----     53 SET j  buckets j1,    j2,    j3,    j4,    j5 ----     53 PARAMETER p  price i1  1.717,    i2  8.433,    i3  5.504,    i4  3.011,    i5  2.922i6  2.241,    i7  3.498,    i8  8.563,    i9  0.671,    i10 5.002 ----     53 VARIABLE x.L  assignment              j1          j2          j3          j4          j5 i1            1i2                        1i3                                    1i4                                                1i5            1i6                                    1i7                                                            1i8            1i9                        1i10                                               1 ----     53 VARIABLE mu.L  average j1 4.401,    j2 4.552,    j3 3.872,    j4 4.007,    j5 3.498 ----     53 VARIABLE y.L  average or zero              j1          j2          j3          j4          j5 i1        4.401i2                    4.552i3                                3.872i4                                            4.007i5        4.401i6                                3.872i7                                                        3.498i8        4.401i9                    4.552i10                                           4.007 ----     53 VARIABLE mumin.L               =        3.498              VARIABLE mumax.L               =        4.552

Side note: What would happen if we allow zero prices to go into a bucket? The model would not restrict $$\mu_j$$ for such a bucket. In extremis we can put all prices in the first bucket (with average price $$\mu_1$$), and then have all the remaining buckets empty. That would allow the model to pick $$\mu_j=\mu_1$$ i.e. no spread. That is exactly what I got when I tried this (by dropping the constraint $$\displaystyle\sum_i x_{i,j} \ge 1$$). The results with the same data as before:

 ----     53 VARIABLE x.L  assignment              j1 i1            1i2            1i3            1i4            1i5            1i6            1i7            1i8            1i9            1i10           1 ----     53 VARIABLE mu.L  average j1 4.156,    j2 4.156,    j3 4.156,    j4 4.156,    j5 4.156 ----     53 VARIABLE y.L  average or zero              j1 i1        4.156i2        4.156i3        4.156i4        4.156i5        4.156i6        4.156i7        4.156i8        4.156i9        4.156i10       4.156 ----     53 VARIABLE mumin.L               =        4.156              VARIABLE mumax.L               =        4.156

The same trick can be used to linearize the problem stated in (3).

##### References
1. Fair allocations to buckets with constraints, http://stackoverflow.com/questions/43662264/fair-allocation-to-buckets-with-constraints
2. Multiplication of a continuous and a binary variable, http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html
3. Convert nonlinear objective function to linear objective function, http://stackoverflow.com/questions/43591174/convert-nonlinear-objective-function-to-linear-objective-function