To illustrate, I generated a random LP where I could fiddle with the density: the number of nonzero elements \(nz\) as a fraction of \(n \times m\) (\(n\) = number of variables, \(m\) = number of equations). To be precise: \[\mathit{density}=\frac{nz}{n \times m}\times 100\%\] In my case I fixed \(m=2,000\) and \(n=5,000\) and varied the density from 100% to 10% to 1%.
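For concreteness, here is a minimal sketch of how such a random instance can be generated with numpy/scipy. This is not the code behind the experiments below; the helper name `make_random_lp` and the particular construction of `b` and `c` are just one convenient choice that guarantees a feasible, bounded LP.

```python
import numpy as np
import scipy.sparse as sp

def make_random_lp(m, n, density, seed=0):
    """Random LP  min c'x  s.t.  Ax <= b,  0 <= x <= 1,
    with a prescribed density of the constraint matrix A."""
    rng = np.random.default_rng(seed)
    # sparse m x n constraint matrix with the requested fraction of nonzeros
    A = sp.random(m, n, density=density, format="csr", random_state=rng)
    x0 = rng.random(n)           # a known feasible point in [0,1]^n
    b = A @ x0                   # so x0 satisfies Ax <= b
    c = rng.standard_normal(n)   # arbitrary objective; the bounds keep the LP bounded
    return c, A, b

m, n = 2_000, 5_000
c, A, b = make_random_lp(m, n, density=0.01)
print(f"nonzeros: {A.nnz:,}, density: {100.0 * A.nnz / (m * n):.1f}%")
```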
The results look like:
| density | nz | Simplex iterations | Simplex time (s) | Barrier iterations | Barrier time (s) |
|---|---|---|---|---|---|
| 100% | 10,000,000 | 9,532 | 295 | 15 | 484 |
| 10% | 1,004,863 | 5,297 | 40 | 13 | 20 |
| 1% | 104,799 | 5,751 | 13 | 13 | 8 |
These timings were with Cplex, but you will see the same behavior using your favorite LP solver.
All the problems have the same "size" in terms of rows and columns, so clearly the number of nonzeros is an important factor. We see the same pattern whether we use the (dual) Simplex method or an interior point algorithm: the time per iteration goes up significantly as the problem gets denser. LP solvers are designed for large, sparse problems, so expect increased solution times when the problem becomes dense.
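To reproduce this effect without a commercial solver, a rough timing loop could look like the sketch below. It assumes the `make_random_lp` helper sketched above and scipy's `linprog` front end to the HiGHS solver, so these are not the Cplex runs in the table; absolute times will differ, but the dependence on density at a fixed size should show the same trend.

```python
import time
from scipy.optimize import linprog

m, n = 2_000, 5_000
for density in (0.01, 0.10, 1.00):
    # note: the 100% case has 10 million nonzeros and may take a long time
    c, A, b = make_random_lp(m, n, density)
    t0 = time.perf_counter()
    res = linprog(c, A_ub=A, b_ub=b, bounds=(0, 1), method="highs")
    elapsed = time.perf_counter() - t0
    print(f"density {density:>4.0%}  nz {A.nnz:>10,}  "
          f"status {res.status}  time {elapsed:6.1f}s")
```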
Luckily, practical LP models (implemented correctly) have a tendency to be very sparse: typically a column has fewer than 5 to 10 nonzeros. With, say, 5 nonzeros per column, a model with 2,000 rows like the one above has a density of only 5/2,000 = 0.25%, so in practice the density of most LP models is much less than 1%.
It is my belief that issues like this do not get enough emphasis in linear programming classes. Maybe too much time is devoted to learning full-tableau simplex methods (in my opinion that is like spending the first five chapters of an aeronautics book on the steam engine).