Sunday, June 17, 2012

Good polling is difficult

From the Huffington Post, a surprisingly detailed exposé on non-response in polling on the US precedential race:

Tuesday, June 5, 2012

Multiple solutions

Many questions are posed about finding a number or all optimal solutions of an LP / MIP problem. Of course this is not always a well-defined problem especially for the LP case. One could reinterpret the question as: give me a number of different optimal bases, but even that may give a large number of identical primal solutions (degeneracy). Probably what is meant is: give me different optimal bases where the primal solution is different.

For a MIP probably what is meant: give me optimal (or close to optimal) solutions that have different integer configurations. I.e. ignore any continuous variables that are possibly present in the model.

Modelers have been struggling with this for a long time. Here are some notable papers:

  • Chemical Engineering Application:

    Recursive MILP model for finding all the alternate optima in LP models for metabolic networks
    by: S. Lee, C. Phalakornkule, M. M. Domach, I. E. Grossmann
    Computers and Chemical Engineering, Vol. 24 (2000),

  • Agricultural Economics:

    Multiple Optimal Solutions in Linear Programming Models, by Quirino Paris, American J. of Agricultural Economics, vol 63 (1981). Also see the comments on this paper by McCarl e.a. and others and the rebuttals.

For a MIP some solvers have tools to help with this. Cplex has a solution pool facility (see the posts and In addition there are interesting modeling tricks to forbid an existing integer solution (but that would require to repeatedly solve a model).

I often believe that asking for many solutions may be the wrong question. If you want to present a (large) number of solutions and let the user pick the best one it may be better to investigate how this works, and incorporate this in the model.

As an example, scheduling models often have (very) different solutions with almost the same objective. It may make sense to choose then a schedule that is close to the schedule calculated in a previous run. To keep solutions close to each other -- unless there is a big cost involved -- is sometimes called “persistency”. It often make sense to add a facility for this in scheduling models, as operators don’t like to look at schedules that change a lot without reason. I don’t see this aspect mentioned much in the literature