In some cases a new version of a solver can perform worse on some models. Here is a particular unfortunate example, where Cplex 12.1 finds several very good solutions in the initial heuristics, but Cplex 12.2 just fails to find any integer solutions within the allotted time:
ILOG CPLEX Nov 1, 2009 23.3.3 WEX 13908.15043 WEI x86_64/MS Windows Cplex MIP uses 1 of 4 parallel threads. Change default with option THREADS. Nodes Cuts/ 0 0 15053.5468 150 15053.5468 151633 Flow cuts applied: 472 |
IBM ILOG CPLEX Jul 4, 2010 23.5.1 WEX 18414.18495 WEI x86_64/MS Windows Cplex MIP uses 1 of 4 parallel threads. Change default with option THREADS. Nodes Cuts/ 0 0 15053.5468 111 15053.5468 120941 |
Such behavior is always difficult to explain to a customer. The argument that on average a new version tends to do better is of little consolation.
PS. The time limit was very tight on this model: 400 seconds (extremely tight considering the LP takes 100 seconds). This was no problem with previous versions of Cplex as the initial heuristic always found solutions very quickly. These solutions were so good they already obeyed the requested gap of 0.8%. Cplex 12.2 finds the solution relatively quickly in the B&B after about 500 seconds on my machine with 1 thread (the client on his machine obtained no solution in 600 seconds with 2 threads and some other solver options; I assume a little bit more time may help here).
PS2. All my runs were with default settings (except 1 thread, gap=0.8%, time limit=400 seconds).
I have struggled with many of such customer arguments. But I think it's an unreasonable demand, that performance should not degrade on any model. It is simply completely impossible. I know cplex takes benchmarking very seriously. I have been to an benchmark talk by Tobias Achterberg from cplex, they have a *huge* collection of customer test models, that must take ages to run.
ReplyDeleteIs the time limit for the second run the same as the total solution time for the first run? It seems 12.2 got a slightly tighter lower bound than 12.1 did, but that's not very meaningful if it took longer to do.
ReplyDeletePresumably you kept any non-default parameter values the same across both runs. I wonder if any defaults changed in 12.2?
I agree with Bo; if you come up with an improvement that works on 95% of your customer models, you will hear primarily from those 5% of customers who don't see any improvement (or perhaps see a decline) in performance. But, you still should go ahead and implement the feature.
ReplyDeleteFor MIPs, the growth in heuristics has a tendency to increase the performance variability. In the log files above, all the feasible solutions came from heuristics.
One useful test to distinguish cases of performance variability from cases where one version really does better involves solving the root algorithm differently. By solving the initial relaxation with primal simplex or barrier method, you typically get a different starting basis, different cuts, and different heuristic behavior. So, for the model above, if you see consistent results even after changing the root node LP solve in this way, that suggests that 12.1 really does better on this model. On the other hand, if you see wild variations in performance by doing this, that suggests that the model has a lot of performance variability, and that care must be taken drawing conclusions from a single comparison.
In general, when I see a log like the one above, I would recommend increasing CPLEX's heuristic frequency parameter, to something like 5 nodes. Increased heuristic application will tend to reduce any variability associated with heuristics. Also, any settings that can tighten the root node (e.g. more aggressive probing and cuts) might yield more consistent results by reducing the number of integer infeasibilities.
If you are welling to send us the model (klotz at us dot ibm dot com), we'll see what we can do with 12.2.