Wednesday, July 4, 2018

LP problems and the number of nonzero elements

I was reviewing an LP model and I was able to reduce the number of nonzero elements in the LP matrix by a large amount. Many new LP modelers focus too much on the number of variables and the number of equations as a measure of size. The number of nonzero elements is a much more important quantity to watch.

To illustrate, I generated a random LP where I could fiddle with the density: the number of nonzero elements \(nz\) as a fraction of \(n \times m\) (\(n\) = number of variables, \(m\) = number of equations). To be precise: \[\mathit{density}=\frac{nz}{n \times m} \cdot 100\%\] In my case I fixed \(m=2,000\) and \(n=5,000\) and varied the density from 100% to 10% to 1%.
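A setup like this is easy to reproduce. Below is a minimal sketch using scipy: it builds a random sparse constraint matrix at a given density and solves the resulting LP. The seed, bounds, objective, and the trick for making the problem feasible are my own illustrative assumptions, not necessarily what was used for the timings in this post.

```python
# Sketch: generate a random feasible LP with a chosen density and solve it.
# All parameter choices here (seed, bounds, objective) are illustrative.
import numpy as np
from scipy.sparse import random as sprandom
from scipy.optimize import linprog

def random_lp(m, n, density, seed=42):
    rng = np.random.default_rng(seed)
    # Sparse m x n constraint matrix with the requested fraction of nonzeros
    A = sprandom(m, n, density=density, random_state=rng, format="csr")
    x0 = rng.uniform(0.0, 1.0, n)   # a known point inside the bounds
    b = A @ x0 + 1.0                # choose b so that A x <= b holds at x0
    c = rng.uniform(-1.0, 1.0, n)   # random objective
    return c, A, b

# Small instance for demonstration (the post used m=2,000 and n=5,000)
c, A, b = random_lp(m=200, n=500, density=0.01)
res = linprog(c, A_ub=A, b_ub=b, bounds=(0, 1), method="highs")
print(res.status, A.nnz)
```

Because the variables are bounded and the constraints are feasible by construction, the LP always has an optimal solution; varying the `density` argument and timing the solve reproduces the kind of experiment shown in the table below.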

The results look like:

| Density | nz | Simplex iterations | Simplex seconds | Barrier iterations | Barrier seconds |
|---------|-----------|--------------------|-----------------|--------------------|-----------------|
| 100%    | 10,000,000 | 9,532 | 2,951 | 54 | 84 |
| 10%     | 1,004,863  | 5,297 | 40    | 13 | 20 |
| 1%      | 104,799    | 5,751 | 13    | 13 | 8  |

These timings were with Cplex but you will see the same behavior using your favorite LP solver.

All the problems have the same "size" in terms of rows and columns, so clearly the number of nonzeros is an important factor. We see the same pattern whether we use the (dual) simplex method or an interior point algorithm: the time per iteration goes up significantly as the problem becomes denser. LP solvers are designed for large, sparse problems, so expect increased solution times when the problem becomes dense.

Luckily, practical LP models (implemented correctly) have a tendency to be very sparse. Typically a column has fewer than 5-10 nonzeros. In practice that means that the density for most LP models is much less than 1%.
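The arithmetic behind that claim is a one-liner. Using the row count from the experiment above and an assumed 5 nonzeros per column:

```python
# With m rows, a column holding k nonzeros contributes a column density of k/m.
m = 2_000          # rows (equations), as in the experiment above
nnz_per_col = 5    # an assumed typical nonzero count per column
density = nnz_per_col * 100 / m
print(f"{density}%")   # → 0.25%
```

So even a column with 10 nonzeros only reaches 0.5% density on a 2,000-row model, well below the 1% case in the table.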

It is my belief that issues like this do not get enough emphasis in linear programming classes. Maybe too much time is devoted to learning full-tableau simplex methods (in my opinion that is like spending the first 5 chapters of an aeronautics book on the steam engine).