As we are talking about binary variables, \(z = x \cdot y\) can be interpreted as \(z = x \textbf{ and } y\). We can resort to two standard formulations:
Formulation 1 | Formulation 2 |
---|---|
\[\begin{align} & \color{darkred} z \le \color{darkred}x\\ & \color{darkred}z \le \color{darkred}y\\ & \color{darkred}z \ge \color{darkred}x+\color{darkred}y-1\\ & \color{darkred}x,\color{darkred}y \in \{0,1\} \\ & \color{darkred}z \in [0,1] \end{align}\] | \[\begin{align} & \color{darkred} z \le \frac{\color{darkred}x+\color{darkred}y}{2}\\ & \color{darkred} z \ge \color{darkred}x+\color{darkred}y-1\\ & \color{darkred}x,\color{darkred}y,\color{darkred}z \in \{0,1\} \\ \end{align}\] |
I always use the first form. But I quite often see the second formulation being used (e.g. [1]).
The second form can be interpreted as an aggregation of the first one by adding up \(z\le x\) and \(z \le y\).
There is a good reason to use the first form: it is tighter. For instance, the fractional value \((x,y) = (0, 1/2)\) yields \(z=0\) in the first formulation, but would allow \(z=0.25\) in the second formulation. The first formulation is so tight, we can relax \(z\in\{0,1\}\) to \(z\in[0,1]\).
My attempt to show graphically the difference:
|
|
Obvious formulation 1 seems to be the winner. But when I try this out on a model [2] I see:
Size | Model 1 Seconds/nodes | Model 2 Seconds/nodes |
---|---|---|
\(n=20,k=10\) | 2.9 / 3117 | 1.8 / 3018 |
\(n=25,k=5\) | 12 / 6245 | 11 / 7043 |
\(n=30,k=4\) | 15 / 5263 | 13 / 4471 |
Well, this does not seem to prove my point. My preferred formulation is actually slightly underperforming.
Let's look at a different solver to see if that proves my point.
Size | Model 1 Seconds/nodes | Model 2 Seconds/nodes |
---|---|---|
\(n=20,k=10\) | 46 / 3082 | 1623 / 279364 |
\(n=25,k=5\) | 105 / 6860 | 1823 / 87396 |
\(n=30,k=4\) | 144 / 12008 | 213 / 11336 |
These results look quite different. First, obviously CBC is slower than Cplex. No surprise there. But also we see formulation 1 is much better than formulation 2. This is more like I expected beforehand. I suspect the cuts produced by Cplex eliminated the advantage of formulation 1. We see that more often: some modeling tricks are becoming less important as solvers are getting smarter.
Notes:
- The meaning of \(n\) and \(k\) for the size of the problem is explained in [2].
- The variables \(z\) were declared as binary both in models 1 and 2.
Thanks for summing this very common trick up concisely. Looks like you have a typo in the second formula line, though. It has to be "x,y,z \in {0,1}".
ReplyDeleteFixed. Thanks.
Delete