Tuesday, February 25, 2025

Small MIP, proving optimality is difficult

This is a simple, smallish MIP model that is difficult to solve to proven optimality. The problem is stated as [1]:

I have grid of dimensions H and W of boolean variables. The only constraint is that if a variable is false then at least one of the adjacent variables (top, right, left, bottom, diagonals don't count) must be true. The goal is to minimize the number of true values in the grid.

A high-level mathematical representation of this can be:

High-level model
\[\begin{align}\min&\sum_{i,j}\color{darkred}x_{i,j}\\ & \color{darkred}x_{i,j}=0 \implies \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 \\ & \color{darkred}x_{i,j}\in \{0,1\} \end{align}\]

The implication can be written as a simple linear constraint: \[ \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 - \color{darkred}x_{i,j}\] or \[ \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 \] We can also find the last constraint directly from the description. Just a different way of looking at the problem statement: at least one from \(\color{darkred}x_{i,j}\) and its direct neighbors should be 1.

One issue that always warrants attention is: how to deal with the border? In this case we can assume \(\color{darkred}x_{i-1,j}=0\) for \(i=1\). In other words, all \(\color{darkred}x_{i,j}\) outside the board are zero. This is exactly how GAMS behaves. GAMS will return a zero when addressing a variable or parameter with leads or lags outside its domain. So, we can simply use the following GAMS implementation:

set
i 'rows' /row1*row32/
j 'columns' /col1*col32/
;

variable z 'objective';
binary variable x(i,j) 'cells';

equations
obj 'objective'
eq(i,j) 'implication'
;

obj.. z =e= sum((i,j),x(i,j));
eq(i,j).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m /all/;
solve m minimizing z using mip;

option x:0:1:1;
display x.l;

A solution with objective 227 can be found rather quickly, but proving optimality is more difficult. The solution of 227 was found after 65 seconds, but I could not prove this was the optimum before hitting my time limit of 1 hour. 



A model with just 1024 binary variables can be difficult even for high-end solvers. Of course \(2^{1024}\) is not that small:



Improvement

One way to improve on the performance is to notice that the interior is rather regular and that the behavior near the border is more unpredictable. That makes sense as we assume all values outside the board are zero. That makes the constraint more difficult to obey on or near the border. My idea is to write the equations as:

\[\begin{align} & \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} = 1 && \text{for $(i,j)$ in the interior part} \\ &  \color{darkred}x_{i,j} + \color{darkred}x_{i-1,j} + \color{darkred}x_{i+1,j} + \color{darkred}x_{i,j-1} + \color{darkred}x_{i,j+1} \ge 1 && \text{for $(i,j)$ near the border}\end{align}\]

I define here the border region as the first and last two rows and columns. The depth of the border region needed to be two levels. When using a skinny border of just one cell, the model was infeasible. This version of the model is solved very fast (0.6 seconds; zero nodes were needed and only 141 simplex iterations, with proven optimality). It is quite remarkable how much impact this new formulation has on the performance. We are talking about a > 6000x performance increase. The picture becomes: 


We could even do the inner portion "by hand" (i.e. by some code), and let the solver worry about the border only. Obviously, inspiration for this approach came from looking at the pictures we produced.

Complete GAMS Model


$onText

https://stackoverflow.com/questions/78987960/suggestion-on-which-constraint-to-adds-to-optimize-minizincs-model

Problem definition:

I have grid of dimensions H and W of boolean variables. The only constraint is that if
a variable is false then at least one of the adjacent variables (top, right, left, bottom,
diagonals don't count) must be true. The goal is to minimize the number of true values in the grid.

I.e.
x(i,j) = 0 => x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1
This can be written as:

x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1 - x(i,j)
or

x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) >= 1

$offText

*------------------------------------------------
* problem size
*------------------------------------------------

set
i 'rows' /row1*row32/
j 'columns' /col1*col32/
;

*------------------------------------------------
* MIP model (difficult to prove optimality)
*------------------------------------------------

variable z 'objective';
binary variable x(i,j) 'cells';

equations
obj 'objective'
eq(i,j) 'implication'
;

obj.. z =e= sum((i,j),x(i,j));
eq(i,j).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m /all/;
option reslim = 100;
solve m minimizing z using mip;

option x:0:0:7;
display x.l;

*------------------------------------------------
* improved MIP model (solves very fast)
*------------------------------------------------

set
interior(i,j)
border(i,j)
;

interior(i,j) = ord(i) > 2 and ord(i) < card(i)-1 and ord(j) > 2 and ord(j) < card(j)-1;
border(i,j) = not interior(i,j);

equations
eq1(i,j) 'interior constraints'
eq2(i,j) 'border constraints'
;

eq1(interior(i,j)).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =e= 1;
eq2(border(i,j)).. x(i,j) + x(i-1,j) + x(i+1,j) + x(i,j-1) + x(i,j+1) =g= 1;

model m2 /obj,eq1,eq2/;
solve m2 minimizing z using mip;

display x.l;

*------------------------------------------------
* Visualization
*------------------------------------------------

$set html board.html

file f /%html%/;
put f;

put '<h1>Board Problem</h1>'/;

put '<table>'
'<tr><td>Board size<td><td><pre>',card(i):0:0,'x',card(j):0:0,'</pre></td></tr>'
'<tr><td>Binary variables<td><td><pre>',card(x):0:0,'</pre></td></tr>'
'<tr><td>Objective<td><td><pre>',z.l:0:0,'</pre></td></tr>'
'</table><p>'/;

put '<svg height="500" width="500" viewbox="0 0 35 35">'/;
loop((i,j),
put '<rect x="',ord(j):0:0,'" y="',ord(i):0:0,'" width="1" height="1" ';
put$(x.l(i,j)<0.5) 'fill="white" ';
put$(x.l(i,j)>0.5) 'fill="lightblue" ';
put 'stroke-width="0.01" stroke="black" />'/;
);
put '</svg>'/;

putclose;

executetool 'win32.ShellExecute "%html%"';


Known solutions


Rob Pratt pointed me to the series listed in https://oeis.org/A104519 with objective values. I initially misread this, but our 227 solution can be found as a(34).

References

6 comments:

  1. This is the domination number of an n by n grid graph, and the values are known for all n: https://oeis.org/A104519

    ReplyDelete
    Replies
    1. Thanks. Not 100% the same as my model, but very close.

      Delete
    2. Hmm, I don't see any difference. For domination number, the problem is to minimize the number of selected nodes subject to a constraint for each node that either the node or at least one of its neighbors is selected. That yields the same formulation as your problem.

      Delete
    3. Yes, 3x3 corresponds to a(3+2) = a(5) = 3 in OEIS: "COMMENTS a(n+2) is also the domination number (size of minimum dominating set) in an n X n grid graph (Alanko et al.)."

      Delete
    4. Ah, I missed "a(n+2) is also the domination number".

      Delete