## Monday, November 14, 2016

### Taxicab distances: L1 norm minimization

From this post:

Suppose we are given $$n$$ points $$A_1, \dots, A_n \in \mathbb{R}^2$$. The task is to find a point $$x = (x_1,x_2) \in \mathbb{R}^2$$ such that the sum of distances to the points $$A_1, \dots, A_n$$ in the $$\ell_1$$-norm is minimized. Formulate this problem as a linear program.

There are at least two (primal) formulations. A variable splitting technique can look like:

 \boxed{\begin{align} \min & \sum_{i,j} y^+_{i,j} + y^-_{i,j}\\ &y^+_{i,j} - y^-_{i,j} = x_j-A_{i,j}\\ &y^+_{i,j}\ge 0, y^-_{i,j}\ge 0\end{align}}

A different formulation is

 \boxed{\begin{align} \min & \sum_{i,j} y_{i,j}\\ &-y_{i,j} \le x_j-A_{i,j} \le y_{i,j}\\ &y_{i,j}\ge 0\end{align}}

I don’t think this formulation has a name. We can drop $$y_{i,j}\ge 0$$ if we want, as this is implied by $$-y_{i,j}\le y_{i,j}$$.

If you let mathematicians loose on this, you may end up with formulation 2 stated as:

 $\boxed{\begin{array}{ll} \text{minimize} & 1_{2n}^{\top} \mathrm y\\ \text{subject to} & -\mathrm y \leq (1_n \otimes \mathrm I_2) \, \mathrm x - \mbox{vec} (\mathrm A) \leq \mathrm y\\ & \mathrm y \geq 0_{2n}\end{array}}$

I don’t really like this type of notation: at least for me it is much less intuitive.

The first formulation has fewer equations but more variables. The second formulation has more constraints as we need to split $$-y_{i,j} \le x_j-A_{i,j} \le y_{i,j}$$ into $$x_j-A_{i,j} \ge –y_{i,j}$$ and $$x_j-A_{i,j} \le y_{i,j}$$. I don’t think there is much reason to choose one formulation above the other. Depending largely on my mood I use either version 1 or 2.