## Sunday, October 3, 2010

### Large NLP/MCP models

For a very large large MCP model (a system of nonlinear equations with complementarity conditions) we observed it became unsolvable, even with a very good starting point. The reason was that the model was kept as small as possible in terms of equations and variables. As a result the model was much more dense and nonlinear (in terms of nonlinear nonzero elements) than needed. Basically, modeling systems and solvers that exploit sparsity should be fed large, sparse models instead of corresponding smaller, denser models. After substituting out some common expressions into separate equations (hence increasing the size in terms of equations and variables) we could solve the larger models (≈400K variables and equations) fairly easily.

An linear example of such a problem is shown here: http://yetanothermathprogrammingconsultant.blogspot.com/2010/04/l1-portfolio-formulation.html.

An example of a nonlinear construct that should raise a red flag is something like:

In many cases it is better to write this as:

This will keep the number of nonlinear nonzero elements smaller at the expense of extra variables and equations.

In the model I was looking at a combination of these two situations was implemented: a large linear expression inside a nonlinear expression, and this nonlinear expression was repeated several times in the model. This caused both extra nonzero elements, and they appeared also in very nonlinear fashion.

For a smaller version of the model, the differences in statistics is striking (I was unable to get the large version to solve with the original formulation):

 old model MODEL STATISTICS BLOCKS OF EQUATIONS          25     SINGLE EQUATIONS      106,219 BLOCKS OF VARIABLES          25     SINGLE VARIABLES      106,236 NON ZERO ELEMENTS     5,238,688     NON LINEAR N-Z      5,034,974 DERIVATIVE POOL           1,828     CONSTANT POOL         104,745 CODE LENGTH         120,759,139 GENERATION TIME      =       52.151 SECONDS EXECUTION TIME       =       52.463 SECONDS RESOURCE USAGE, LIMIT        281.806      1000.000 new formulation MODEL STATISTICS BLOCKS OF EQUATIONS          26     SINGLE EQUATIONS      107,605 BLOCKS OF VARIABLES          26     SINGLE VARIABLES      107,622 NON ZERO ELEMENTS       768,295     NON LINEAR N-Z        561,809 DERIVATIVE POOL           1,859     CONSTANT POOL         104,745 CODE LENGTH          14,348,068 GENERATION TIME      =        7.566 SECONDS EXECUTION TIME       =        7.972 SECONDS RESOURCE USAGE, LIMIT         57.216      1000.000

Unfortunately solvers and modeling systems don’t give much feedback on these issues. Such a diagnoses requires some modeling knowledge and experience (one reason why I am sometimes hired to help out).