In this post, I want to discuss an observation about the root node when solving a MIP model.
Problem Description
The model I want to tackle here is from [1]:
We have a large matrix A=ai,j with values 0 and 1. In addition, we have a minimum on the row and column totals. These are called ri and cj. The goal is to remove as many 1's in the matrix A subject to these minimum row and column totals.
Mathematical Model
The mathematical model for this problem is not very difficult. We define:
xi,j={1if we need to keep the corresponding value ai,j=10if we can drop the corresponding value ai,j=1 We don't consider the cases where ai,j=0.
Now, we can write:
Mathematical Model |
min∑i,j|ai,j=1xi,j∑j|ai,j=1xi,j≥ri∀i∑i|ai,j=1xi,j≥cj∀jxi,j∈{0,1} |
If you prefer, we can also write this as:
Mathematical Model |
min∑i,jai,j⋅xi,j∑jai,j⋅xi,j≥ri∀i∑iai,j⋅xi,j≥cj∀jxi,j∈{0,1} |
The post in [1] mentions that the goal is to solve this model with a matrix size of 20,000 rows and 500 columns. As I used a fraction of 10% ones in the matrix A, the number of (discrete) variables in the model is close to million: 999,469 (we could have predicted this from 20,000×500×0.1). The number of constraints is 20,500.
Observation
Branch-and-bound algorithms start with solving a relaxed model. This is a model where all integer restrictions are ignored and thus becomes an LP. In some cases, notably when the problem is a network model, this relaxed LP automatically yields integer-valued solutions. This is such a case. A network model is characterized by:
- All matrix coefficients are −1 or +1 (check),
- the right-hand-side should be integer valued (check),
- there are two nonzero coefficients in each column of the constraint matrix (check),
- and these two nonzero coefficients have the value −1 and +1.
The last condition is not met. But, by multiplying the last constraint by −1, we achieve what we want. Cplex will do this actually automatically when invoking the network solver.
If we solve such a model as an LP or as a network model, the solution will be automatically integral. If we throw this into a MIP solver, we expect the solver to start with the relaxation, and then stop quickly because the solution is integer-valued. There is no need for branching. I.e., solving the MIP and the LP should take about the same time. That is not the case here:
---- 88 PARAMETER results
MIP LP NETWORK
status Optimal Optimal Optimal
obj 504711.000 504711.000 504711.000
time 5141.515 14.437 4.000
iter 571595.000 38276.000
nodes NA NA
The MIP is not branching (the node count is zero), but it spends a lot of time preprocessing just after solving the root node. The number of iterations for the network solver is incorrectly reported as zero here. It actually did 1,235,117 iterations.
This may be related to how the presolve behaves. It may do things slightly differently when it knows the variables are discrete. However, when looking at the log, the first best bound is unbelievably poor: it is zero!
Iteration log . . .
Iteration: 1 Dual objective = 1079.000000
Perturbation started.
Iteration: 606 Dual objective = 499871.000000
Iteration: 2167 Dual objective = 499871.008110
Iteration: 3647 Dual objective = 499871.016553
. . .
Iteration: 33697 Dual objective = 504711.126635
Iteration: 33962 Dual objective = 504711.126636
Removing perturbation.
Primal simplex solved model.
Root relaxation solution time = 16.92 sec. (4357.36 ticks)
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap
* 0+ 0 999469.0000 0.0000 100.00%
Found incumbent of value 999469.000000 after 20.62 sec. (6918.29 ticks)
* 0+ 0 522140.0000 0.0000 100.00%
Found incumbent of value 522140.000000 after 5140.89 sec. (6925.92 ticks)
* 0 0 integral 0 504711.0000 504711.0000 571595 0.00%
Elapsed time = 5140.97 sec. (7024.08 ticks, tree = 0.00 MB)
Found incumbent of value 504711.000000 after 5140.97 sec. (7024.08 ticks)
Root node processing (before b&c):
Real time = 5140.98 sec. (7037.52 ticks)
Parallel b&c, 16 threads:
Real time = 0.00 sec. (0.00 ticks)
Sync time (average) = 0.00 sec.
Wait time (average) = 0.00 sec.
------------
Total (root+branch&cut) = 5140.98 sec. (7037.52 ticks)
I have no idea why this is the case. This would mean the relaxed objective is zero. I don't believe that to be the case.
The actual LP log for the root is not listed as Cplex solves the root using several different LP algorithms in parallel. The first one to finish seems to be primal simplex, while the log is from the dual simplex method. But still, I can't imagine that the primal simplex solves the root node with an optimal objective of 0. (Update: as speculated below, the first few lines in the MIP log may be executed before the root LP was solved.)
Conclusion
One thing is clearly demonstrated here: if the relaxed LP is automatically integral, solving as a MIP can be very slow. In this case, we talk about 5000 seconds vs. 14 seconds. The network extraction and subsequent solving are even faster: 4 seconds.
I am not sure why solving as a MIP is slower than solving as an LP. Sometimes the argument is that the presolve behaves differently (more reductions when it is known that the variables are integers). If that prevents the solution of the root node from being integer-valued, we certainly paid a very high price for that. Another cause could be that Cplex is just using some very expensive heuristics to find integer solutions "quickly" before it bothers to inspect the solution of the relaxed problem. Heuristics need to be fast to be useful as part of a solution strategy. Using a heuristic that takes more than an hour for a problem that solves in a few seconds is not the smartest thing to do.
In the end, I think the MIP formulation is more precise. It conveys my intention better. The solver should really not punish me for that.
Update: some other solvers do not seem to have this extreme behavior. They solve the MIP and LP models in about the same time. It looks like the poor performance on the MIP model is a particularity of Cplex.
References
Appendix: GAMS model
In GAMS, we can solve a relaxed MIP model by using an RMIP model.
$onText A very large MIP. It takes some time to solve. However the LP is actually automatically integer valued. This means the relaxation is no good? Finally we also solve as a network model. Reference: https://stackoverflow.com/questions/76105697/efficiently-remove-a-maximum-amount-of-binary-elements-while-keeping-row-and-col $offText *----------------------------------------------------------------------------------- * Data *----------------------------------------------------------------------------------- set i /i1*i20000/ j /j1*j500/ ; set a(i,j) 'randomly filled with 10% ones'; a(i,j)$(uniform(0,1)<0.1) = YES; * calculate rows and column sums parameter rowsum(i), colsum(j); rowsum(i) = sum(a(i,j),1); colsum(j) = sum(a(i,j),1); parameter r(i),c(j); r(i) = ceil(rowsum(i)/2); c(j) = ceil(colsum(j)/2); *----------------------------------------------------------------------------------- * MIP Model *----------------------------------------------------------------------------------- binary variable x(i,j) 'only used when a(i,j)=1'; variable z 'objective'; equations obj erowsum(i) 'minimum number of ones in row' ecolsum(j) 'minimum number of ones in column' ; obj.. z =e= sum(a,x(a)); erowsum(i).. sum(a(i,j),x(i,j)) =g= r(i); ecolsum(j).. sum(a(i,j),x(i,j)) =g= c(j); model m /all/; *----------------------------------------------------------------------------------- * Reporting macro *----------------------------------------------------------------------------------- acronym Optimal; parameter results; $macro report(m,label) \ results('status',label) = m.modelstat; \ results('status',label)$(m.modelstat=1) = Optimal; \ results('obj',label) = z.l; \ results('frac',label) = sum((i,j)$(x.l(i,j)>0.001 and x.l(i,j)<0.999),1); \ results('time',label) = m.resusd; \ results('iter',label) = m.iterusd; \ results('nodes',label) = m.nodusd; *----------------------------------------------------------------------------------- * solve as MIP, LP and Network model *----------------------------------------------------------------------------------- m.solprint = %solprint.Silent%; option threads = 0, bratio = 1; solve m minimizing z using mip; report(m,'MIP') solve m minimizing z using rmip; report(m,'LP') m.optfile=1; solve m minimizing z using rmip; report(m,'NETWORK') display results; *----------------------------------------------------------------------------------- * network option file, for use with third solve *----------------------------------------------------------------------------------- $onecho > cplex.opt lpmethod 3 $offecho |
Appendix Highs open source solver
The open-source solver HighS seems to do a good job on this (using the interior point code):
C:\tmp\HiGHS\build\RELEASE\bin>highs --parallel on --solver ipm
\tmp\test\Free.mps Running HiGHS 1.5.0 [date:
2023-04-28, git hash: d4809eae7] Copyright (c) 2022 HiGHS under
MIT licence terms LP Free has 20501 rows; 999470 cols; 2998408
nonzeros Presolving model20500 rows,
999469 cols, 1998938 nonzeros 20500 rows, 999469 cols, 1998938
nonzeros Presolve : Reductions: rows
20500(-1); columns 999469(-1); elements 1998938(-999470) Solving the presolved LP IPX model has 20500 rows, 999469
columns and 1998938 nonzeros Input Number of variables: 999469 Number of free variables: 0 Number of constraints: 20500 Number of equality constraints: 0 Number of matrix entries: 1998938 Matrix range:
[1e+00, 1e+00] RHS range:
[1e+01, 1e+03] Objective range: [1e+00, 1e+00] Bounds range:
[1e+00, 1e+00] Preprocessing Dualized model: no Number of dense columns: 0 Range of scaling factors: [1.00e+00,
1.00e+00] IPX version 1.0 Interior Point Solve Iter
P.res D.res P.obj D.obj mu
Time 0 3.18e+00
1.61e+00 4.99746607e+05
-1.05662220e+06 5.73e+00 0s 1
1.87e-01 3.87e-01
5.02178430e+05 -7.59933312e+05
8.62e-01 0s 2
1.87e-07 3.87e-07
5.06231009e+05
2.51321995e+05 1.26e-01 0s 3
3.33e-08 2.08e-10 5.04967969e+05 5.04559804e+05 2.02e-04 1s 4
1.95e-09 4.44e-16
5.04726025e+05
5.04710964e+05 7.46e-06 1s 5
7.94e-11 4.44e-16
5.04711612e+05
5.04710999e+05 3.04e-07 1s 6
1.45e-11 4.44e-16
5.04711112e+05 5.04710964e+05 7.33e-08 1s 7*
4.84e-12 4.44e-16
5.04711002e+05
5.04711000e+05 1.08e-09 1s Constructing starting basis... Running crossover as requested Primal residual before push phase: 2.43e-07 Dual residual before push phase: 3.62e-14 Number of dual pushes required: 9953 Number of primal pushes required: 979469 440683 primal pushes remaining ( 436418
pivots) 235353 primal pushes remaining ( 624870
pivots) 37258 primal pushes remaining ( 803629
pivots) Summary Runtime:
17.90s Status interior point solve: optimal Status crossover: optimal objective value:
5.04711002e+05 interior solution primal residual
(abs/rel): 5.00e-12 / 4.63e-15 interior solution dual residual
(abs/rel): 4.44e-16 / 2.22e-16 interior solution objective gap
(abs/rel): 2.17e-03 / 4.30e-09 basic solution primal infeasibility: 0.00e+00 basic solution dual infeasibility: 0.00e+00 Ipx: IPM optimal Ipx: Crossover optimal Solving the original LP from the
solution after postsolve Model status
: Optimal IPM iterations: 7 Crossover iterations: 847053 Objective value :
5.0471100000e+05 HiGHS run time : 19.52
|
Somehow I can't run this directly from GAMS, as it needs a license. I don't know why.
Just I guess, but I suspect that the CPLEX does a bunch of presolve operations before it even attempts to solve the root LP. That would account, at least in part, for the time differential, and I think it might also account for the zero lower bound. Note that the log lines with zero lower bound also report zero (blank) iteration count. I'm guessing the zero lower bound came from some very basic inference by the presolver (the sum of a bunch of binary variables cannot go below 0) and predated any attempt to solve the root LP.
ReplyDeleteThis could support that:
DeleteParallel mode: deterministic, using up to 16 threads.
Parallel mode: deterministic, using up to 3 threads for parallel tasks at root LP.
Parallel mode: deterministic, using up to 13 threads for concurrent optimization
So 3 threads are devoted for doing things other than solving the LP (via concurrent optimization). But still. the LP takes only 17 seconds. They could check integrality in the 18th second and just kill running these expensive (5000 seconds) heuristics. Could be that these heuristics are usually cheap, so normally not a problem to wait for them to finish.
This comment has been removed by the author.
ReplyDeleteI have seen that Cplex seems to be stuck for a very long time after the root LP has been solved many times. What often helps me in those cases is to disable "auxiliary tasks" at the root node. I am not sure what those tasks are exactly. It was once told to me by Cplex support the they include "extra heuristics running at root node only". One can disable those tasks via GAMS/CPLEX option auxrootthreads=-1. The line
ReplyDeleteParallel mode: deterministic, using up to 3 threads for parallel tasks at root LP.
should then disappear and for me the MIP solves a bit faster than the LP (33 sec. vs 42 sec.).
Good to know. You may want to share this problem with the Cplex people as it is embarrasing enough to warrant some attention.
Delete