Monday, April 18, 2011

Large transportation problem

I was rerunning a large instance of a test problem with GAMS/MOSEK. As it is a network problem (a transportation model in fact; it models how to assign people to jobs after a reorganization), it makes sense to try a few different algorithms.

Statistics (incl obj) Rows 4,601
  Columns 4,800,001
  NZ 13,020,000
Timings (sec) GAMS generation 40
  MOSEK default (interior point) 105
  MOSEK primal simplex 57
  MOSEK network 49

This is with default GAMS-solver communication (i.e. files). With DLL usage (no files) the times for GAMS and MOSEK are slightly larger (I don’t know why this is the case: in-core data structures should in general – well actually always – be faster than disk files; I suspect there is something wrong with the new “load library” link).

Update: a little more investigation shows that GAMS still writes a very large socalled dictionary file even if we use solvelink = 5 (load library). The fastest way seems to drop m.solvelink=5 (still don’t understand why that option is not faster) but to include m.dictfile=0. This yields: GAMS generation time: 31 seconds, MOSEK network code: 44 seconds. I.e. GAMS still generates this large network model a little bit quicker than the fastest network solver I have access to can solve it.


  1. I am surprised that the interior-point does so well. You should expect with the large number columns compared to the number rows that it would be slow. Also MOSEK default interior-point runs in 1 thread, so using multiple threads may reduce time.

    Dual simplex was much slower or not tried?

  2. I always had the impression network problems solved quite efficiently using interior point methods. May be compared to dual simplex this can be true.

    MOSEK Dual simplex took 215 seconds on this problem.

  3. If the network problems are huge, then interior point will run into memory problems in my experience (of course machine dependent).

    Dual simplex will do very well (or even outperform) network simplex on problems with ranged variables and/or hard to find primal feasible solutions. Most (all) the network implementations are primal based, which may not be the best decision

  4. An interior-point method has to factor ADA' in every iteration (D is diagonal matrix). You can prove if the underlying network is dense then the factor is going to be dense. This happens even though A is very sparse.

    Most likely your network problems have always been sparse because that is normally the case for real world problems.

    Btw even dual simplex does quite well. Since you have 1000 times more columns than rows I would have assumed that would kill dual simplex.