For a very large large MCP model (a system of nonlinear equations with complementarity conditions) we observed it became unsolvable, even with a very good starting point. The reason was that the model was kept as small as possible in terms of equations and variables. As a result the model was much more dense and nonlinear (in terms of nonlinear nonzero elements) than needed. Basically, modeling systems and solvers that exploit sparsity should be fed large, sparse models instead of corresponding smaller, denser models. After substituting out some common expressions into separate equations (hence increasing the size in terms of equations and variables) we could solve the larger models (≈400K variables and equations) fairly easily.
An linear example of such a problem is shown here: http://yetanothermathprogrammingconsultant.blogspot.com/2010/04/l1-portfolio-formulation.html.
An example of a nonlinear construct that should raise a red flag is something like:
In many cases it is better to write this as:
This will keep the number of nonlinear nonzero elements smaller at the expense of extra variables and equations.
In the model I was looking at a combination of these two situations was implemented: a large linear expression inside a nonlinear expression, and this nonlinear expression was repeated several times in the model. This caused both extra nonzero elements, and they appeared also in very nonlinear fashion.
For a smaller version of the model, the differences in statistics is striking (I was unable to get the large version to solve with the original formulation):
old model | MODEL STATISTICS BLOCKS OF EQUATIONS 25 SINGLE EQUATIONS 106,219 BLOCKS OF VARIABLES 25 SINGLE VARIABLES 106,236 NON ZERO ELEMENTS 5,238,688 NON LINEAR N-Z 5,034,974 DERIVATIVE POOL 1,828 CONSTANT POOL 104,745 CODE LENGTH 120,759,139 GENERATION TIME = 52.151 SECONDS EXECUTION TIME = 52.463 SECONDS RESOURCE USAGE, LIMIT 281.806 1000.000 |
new formulation | MODEL STATISTICS BLOCKS OF EQUATIONS 26 SINGLE EQUATIONS 107,605 BLOCKS OF VARIABLES 26 SINGLE VARIABLES 107,622 NON ZERO ELEMENTS 768,295 NON LINEAR N-Z 561,809 DERIVATIVE POOL 1,859 CONSTANT POOL 104,745 CODE LENGTH 14,348,068 GENERATION TIME = 7.566 SECONDS EXECUTION TIME = 7.972 SECONDS RESOURCE USAGE, LIMIT 57.216 1000.000 |
Unfortunately solvers and modeling systems don’t give much feedback on these issues. Such a diagnoses requires some modeling knowledge and experience (one reason why I am sometimes hired to help out).