Saturday, December 17, 2022

Comparing matrix balancing objectives

The matrix balancing problem can be described as follows [1]: find a non-negative matrix \(\color{darkred}A_{i,j}\) that is as close as possible to a given matrix \(\color{darkblue}A^0_{i,j}\), while observing given row and column totals and a given sparsity pattern. Or, mathematically,

Matrix Balancing Problem
\[\begin{align}\min_{\color{darkred}A}\>&{\bf{dist}}(\color{darkred}A,\color{darkblue}A^0)\\ & \sum_i \color{darkred}A_{i,j} = \color{darkblue}c_j && \forall j\\ & \sum_j \color{darkred}A_{i,j} = \color{darkblue}r_i && \forall i \\&\color{darkred}A_{i,j}=0 &&\forall i,j|\color{darkblue}A^0_{i,j}=0\\ &\color{darkred}A_{i,j}\ge 0 \end{align} \]

There are numerous ways to specify the objective. Here we focus on three popular ones:
  • Cross-entropy: \(\displaystyle{\min \sum_{i,j} \color{darkred}A_{i,j} \ln \frac{\color{darkred}A_{i,j}}{\color{darkblue}A^0_{i,j}}}\),
  • Quadratic: \(\displaystyle{\min \sum_{i,j} \left(\color{darkred}A_{i,j}-\color{darkblue}A^0_{i,j}\right)^2}\),
  • Relative Quadratic: \(\displaystyle{\min \sum_{i,j} \left(\frac{\color{darkred}A_{i,j}}{\color{darkblue}A^0_{i,j}}-1\right)^2}\)
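It is a known result that the cross-entropy variant can be solved without an optimization solver: iterative proportional fitting (the RAS method) converges to the minimum cross-entropy solution. Below is a minimal numpy sketch; the data is made up for illustration, and the totals must be consistent (\(\sum_i \color{darkblue}r_i = \sum_j \color{darkblue}c_j\)):

```python
import numpy as np

def ras(A0, r, c, tol=1e-10, max_iter=1000):
    """Iterative proportional fitting (RAS): alternately scale rows and
    columns of A0 until the totals match. The fixed point is the minimum
    cross-entropy solution; zeros in A0 automatically stay zero."""
    A = A0.astype(float).copy()
    for _ in range(max_iter):
        A *= (r / A.sum(axis=1))[:, None]   # match row totals
        A *= c / A.sum(axis=0)              # match column totals
        if np.abs(A.sum(axis=1) - r).max() < tol:
            break
    return A

# tiny made-up example; note sum(r) == sum(c)
A0 = np.array([[1.0, 2.0, 0.0],
               [2.0, 1.0, 1.0]])
r = np.array([4.0, 6.0])       # target row totals
c = np.array([5.0, 3.0, 2.0])  # target column totals
A = ras(A0, r, c)
```

After the column-scaling step the column sums match exactly, and the loop stops once the row sums are within tolerance as well.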

One interesting way to compare these objectives is to look at the distributions of the errors \(\color{darkred}A-\color{darkblue}A^0\) (only for nonzero \(\color{darkblue}A^0_{i,j}\)). The technique used here is a smoothed kernel density estimate (KDE). We created a very large sparse matrix with some random data. Using these three different objectives, we see the following pattern:

The relative quadratic objective has more large errors and more very small ones. The absolute quadratic objective does not have large errors (they are penalized heavily), but to compensate for that, it cannot create as many very small errors. In a sense, it is a game of whack-a-mole. Pushing down on errors in a certain region will increase them somewhere else.

The entropy objective is nestled nicely between the two quadratic distributions. 


  • I think this is a rather appealing visualization, giving insight into how these objectives behave. What is missing is a sense of where these errors occur: for small \(\color{darkblue}A^0_{i,j}\) or for large ones. I am not sure how to depict that.
  • A colleague of mine prefers to use a linear combination of the absolute and relative quadratic objective. This plot provides some support for that idea.
  • When using an absolute value objective, \[\min \sum_{i,j} \left|\color{darkred}A_{i,j}-\color{darkblue}A^0_{i,j}\right|\] we can use high-performance LP solvers (after a standard reformulation that splits each error into its positive and negative parts). The result is typically many zero errors and a number of very large ones.
  • For large problems, make sure to use interior point solvers. For best performance, turn off the crossover algorithm. Note that when using concurrent LP solvers, the screen log may not show that the crossover is becoming a bottleneck.
  • The plot was made using ggplot2's geom_density. This is essentially a continuous version of a histogram. I think the curves are normalized, so each integrates to 1. I don't think that distorts my data, as I have the same number of observations for all cases (so the same area before normalization).
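The LP reformulation of the absolute-value objective mentioned in the notes above introduces nonnegative variables \(e^+_{i,j}, e^-_{i,j}\) with \(\color{darkred}A_{i,j} = \color{darkblue}A^0_{i,j} + e^+_{i,j} - e^-_{i,j}\), so that \(|{\color{darkred}A_{i,j}}-{\color{darkblue}A^0_{i,j}}| = e^+_{i,j}+e^-_{i,j}\) at the optimum. A small sketch using scipy.optimize.linprog; the data and names are illustrative:

```python
import numpy as np
from scipy.optimize import linprog

# made-up data; totals must be consistent: sum(r) == sum(c)
A0 = np.array([[1.0, 2.0, 0.0],
               [2.0, 1.0, 1.0]])
r = np.array([4.0, 6.0])       # target row totals
c = np.array([5.0, 3.0, 2.0])  # target column totals

# one error pair (e+, e-) per nonzero cell of A0
rows, cols = np.nonzero(A0)
n = len(rows)

# equality constraints: errors per row/column must close the totals' gap
A_eq = np.zeros((len(r) + len(c), 2 * n))
b_eq = np.concatenate([r - A0.sum(axis=1), c - A0.sum(axis=0)])
for k, (i, j) in enumerate(zip(rows, cols)):
    A_eq[i, k], A_eq[i, n + k] = 1.0, -1.0                    # row i
    A_eq[len(r) + j, k], A_eq[len(r) + j, n + k] = 1.0, -1.0  # column j

cost = np.ones(2 * n)  # minimize sum(e+) + sum(e-)
# bound e- by A0[i,j] so the balanced entry stays non-negative
bounds = [(0, None)] * n + [(0, A0[i, j]) for i, j in zip(rows, cols)]
res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")

A = A0.copy()
A[rows, cols] += res.x[:n] - res.x[n:]
```

At the optimum at most one of \(e^+_{i,j}, e^-_{i,j}\) is nonzero per cell (otherwise both could be reduced, lowering the objective), so the objective indeed equals the sum of absolute errors.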



1 comment:

  1. On Note 1, I think the absolute errors could be plotted against the original $A^0_{i,j}$, further classified by color for the sign of the original entry ($A^0_{i,j}$ positive/negative) and by shape for the error's position in the distribution (for example, quartile placement 1Q/2Q/3Q/4Q).
    The cross-entropy objective implies the original matrix is also non-negative, while the other objectives are free in this sense.