Yet Another Math Programming Consultant: A different coloring problem

In [1] the following problem is proposed:

Consider a complete (undirected) graph,
where each arc has a weight.
Given a collection of colors, color each node,
such that the sum of the weights of arcs with same-colored nodes is minimized.
We can assume some sparsity in the weights.

The user expects the number of nodes to be between 20 and 50, and there are just a small number of colors.

Minimize weights for the colored arcs

The picture shows a complete graph (all nodes have a direct arc to each other) with 10 nodes. The optimal coloring of nodes minimizes the sum of weights along the colored edges. An edge is colored if the two incident nodes have the same color.

Data

The first thing I always do when confronted with a problem like this is to form a data set. This is a very useful exercise: it forces you to get a reasonable understanding of the problem. More than just casual reading of some text.

Say we have $n=50$ nodes.
As we are dealing with an undirected graph, I'll create arcs $i-j$ only for $i \lt j$. So we have 1,225 arcs. The set of arcs is called $\color{darkblue}a_{i,j}$.
I form weights $\color{darkblue}w_{i,j}$ as follows:

With probability 0.75, the weight is zero
The other weights are uniformly distributed from $U(0,1)$

I use $k=5$ colors.

A zero weight does not add to the objective. Many zero weights are likely to make the problem easier. Very few or very many colors may also make the problem easier to solve.

Model

As in our graph-coloring model [2], we choose as central variable: \[\color{darkred}x_{n,c}=\begin{cases}1 & \text{if node $n$ gets assigned color $c$}\\ 0 & \text{otherwise}\end{cases}\] Using binary variables instead of a single dimensional integer variable makes life much easier. I also assume $\color{darkblue}w_{i,j}\ge 0$.

An MIQP model is surprisingly simple:

MIQP Model
\[\begin{align} \min & \sum_{\color{darkblue}a_{i,j},c} \color{darkblue}w_{i,j} \cdot \color{darkred}x_{i,c} \cdot \color{darkred}x_{j,c}\\ & \sum_c\color{darkred}x_{n,c}=1 && \forall n \\& \color{darkred}x_{n,c} \in \{0,1\} \end{align}\]

This model can be linearized as:

MIP Model
\[\begin{align} \min & \sum_{\color{darkblue}a_{i,j}} \color{darkblue}w_{i,j} \cdot \color{darkred}y_{i,j}\\ & \color{darkred}y_{i,j} \ge \color{darkred}x_{i,c} + \color{darkred}x_{j,c} - 1 && \forall \color{darkblue}a_{i,j},c\\ & \sum_c\color{darkred}x_{n,c}=1 && \forall n \\& \color{darkred}x_{n,c} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in [0,1] \end{align}\]

The constraint \[ \color{darkred}y_{i,j} \ge \color{darkred}x_{i,c} + \color{darkred}x_{j,c} - 1 \>\>\> \forall \color{darkblue}a_{i,j},c\] implements the implication \[\color{darkred}x_{i,c} = \color{darkred}x_{j,c} = 1 \Rightarrow \color{darkred}y_{i,j}=1\] Note that the variable $\color{darkred}y_{i,j}$ is continuous between 0 and 1. It will be integer automatically.

A possible refinement is to introduce $\color{darkred}y_{i,j}$ only when $\color{darkblue}w_{i,j} \ne 0$. That will reduce the number of variables and constraints being generated. A good solver would also remove these in the presolver, so this is not essential.

Note that these models give the same optimal objective value.

Side note. In this reformulation I assumed $\color{darkblue}w_{i,j} \ge 0$. If that is not the case we can use a slightly more complex formulation: \[\begin{align} \min & \sum_{\color{darkblue}a_{i,j},c} \color{darkblue}w_{i,j} \cdot \color{darkred}y_{i,j,c} \\ & \color{darkred}y_{i,j,c} \le \color{darkred}x_{i,c} && \forall \color{darkblue}a_{i,j},c \> \text{if $\color{darkblue}w_{i,j}\lt 0$} \\ & \color{darkred}y_{i,j,c} \le \color{darkred}x_{j,c} && \forall \color{darkblue}a_{i,j},c \> \text{if $\color{darkblue}w_{i,j}\lt 0$} \\ & \color{darkred}y_{i,j,c} \ge \color{darkred}x_{i,c} + \color{darkred}x_{j,c} - 1 && \forall \color{darkblue}a_{i,j},c \> \text{if $\color{darkblue}w_{i,j}\gt 0$} \\ & \color{darkred}x_{n,c} \in \{0,1\} \\ & \color{darkred}y_{i,j,c} \in [0,1]\end{align}\]

There is some symmetry in the model that we can exploit. The actual colors don't matter, so we can fix node $n_1$ to have color $c_1$. Next we can say that node $n_2$ can assume colors $\{c_1,c_2\}$, etc. Of course, this is just one possible scheme; there are many others we can think of. These symmetry-breaking constraints can help a bit in more difficult cases.

Results

These models solve very fast. Cplex essentially solves the same problem as it linearizes the MIQP automatically.

The solution looks like:

---- 91 VARIABLE x.L assign color to node

color1 color2 color3 color4 color5

n1        1.000
n2                                                        1.000
n3        1.000
n4                                            1.000
n5                                1.000
n6                                1.000
n7        1.000
n8                    1.000
n9        1.000
n10                               1.000
n11                                                       1.000
n12                                                       1.000
n13                               1.000
n14                   1.000
n15       1.000
n16                                           1.000
n17                   1.000
n18                   1.000
n19                               1.000
n20       1.000
n21                               1.000
n22                                                       1.000
n23                   1.000
n24                                           1.000
n25                   1.000
n26                                                       1.000
n27                                           1.000
n28       1.000
n29                   1.000
n30                               1.000
n31                               1.000
n32                                                       1.000
n33                                           1.000
n34                                           1.000
n35                   1.000
n36                               1.000
n37                                           1.000
n38                                           1.000
n39                                                       1.000
n40                                                       1.000
n41       1.000
n42                                           1.000
n43                                           1.000
n44                   1.000
n45                   1.000
n46       1.000
n47                                                       1.000
n48                                                       1.000
n49                                           1.000
n50                               1.000

---- 91 VARIABLE cost.L = 0.579 number of arcs with same colored nodes

---- 96 PARAMETER wx obj contribution

n20 n26 n31 n41 n46 n48

n3        0.150                               0.017
n10                               0.046
n12                   0.073
n15                                                       0.101
n40                                                                   0.193

In the last table, I display the arcs that contribute to the objective. We see the model is very clever to select as many as possible arcs that have a zero weight when choosing the same color for nodes. There are just a few penalized arcs left. That is also making this problem easy to solve. The problem with $k=3$ colors is actually more difficult to solve. The solution for this $k=3$ problem is:

---- 91 VARIABLE x.L assign color to node

color1 color2 color3

---- 91 VARIABLE cost.L = 16.954 number of arcs with s
ame colored nodes

---- 96 PARAMETER wx obj contribution

n12 n13 n15 n16 n17 n20

n2                    0.749       0.270
n3                                            0.297       0.246
n5        0.323                   0.811                               0.415
n11                                                       0.455
n12                                                                   0.400

+ n22 n24 n25 n26 n29 n31

n3                                                        0.187
n4                                                                    0.774
n6        0.013
n11                               0.886
n13       0.047       0.837
n16                               0.129       0.311
n22                   0.543
n26                                                       0.445

+ n32 n33 n34 n35 n36 n37

n1                                0.314
n3                                                        0.544
n7                                0.265
n9                                0.815
n26                                           0.147
n28                   0.450                                           0.471
n29                                                       0.285
n30       0.594
n32                   0.025                                           0.293

+ n40 n41 n42 n44 n45 n46

n3                                            0.196
n14                                                                   0.873
n19                   0.043
n30                               0.352
n34       0.257
n36                                                       0.903
n45                                                                   0.120

+ n47 n49 n50

n15                               0.178
n19       0.790
n24       0.118
n39                               0.611
n40                   0.049
n41                               0.123

With just 3 colors there is more opportunity for the arcs to have the same color at each end. This model is more time-consuming: it takes about a minute to solve.

Cplex tidbits

The MIQP log shows a few interesting lines:

Tried aggregator 1 time.
MIP Presolve eliminated 0 rows and 1 columns.
MIP Presolve added 1926 rows and 963 columns.
Reduced MIP has 1976 rows, 1113 columns, and 4002 nonzeros.
Reduced MIP has 1113 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.09 sec. (0.34 ticks)
Probing time = 0.00 sec. (0.37 ticks)
Tried aggregator 1 time.
Detecting symmetries...
MIP Presolve eliminated 963 rows and 0 columns.
Reduced MIP has 1013 rows, 1113 columns, and 3039 nonzeros.
Reduced MIP has 1113 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (3.24 ticks)
Classifier predicts products in MIQP should be linearized.

The MIQP has been reformulated into a MIP by adding variables. Cplex decides to use this based not on human reasoning but on machine learning.

The log of our manually linearized model contains:

Tried aggregator 1 time.
MIP Presolve eliminated 2712 rows and 905 columns.
Reduced MIP has 1013 rows, 471 columns, and 3039 nonzeros.
Reduced MIP has 150 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (2.85 ticks)
Found incumbent of value 166.366290 after 0.00 sec. (3.79 ticks)
Probing time = 0.00 sec. (0.27 ticks)
Tried aggregator 1 time.
Detecting symmetries...
Reduced MIP has 1013 rows, 471 columns, and 3039 nonzeros.
Reduced MIP has 471 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.00 sec. (2.20 ticks)

It does not say this explicitly, but it converted our continuous variables $\color{darkred}y_{i,j}$ back to binary variables, ignoring my advice to use continuous variables. Also, note that my linear model is a bit more skimpy w.r.t. binary variables. Likely Cplex in the MIQP linearization is not as smart as I am: it introduces $\color{darkred}y_{i,j,c}$ to the model instead of my $\color{darkred}y_{i,j}$ variables. We can see this by the number of binary variables in the MIQP model after linearization.

--- MIQP status (101): integer optimal solution.
--- Cplex Time: 29.88sec (det. 22780.65 ticks)

--- MIP status (101): integer optimal solution.
--- Cplex Time: 50.83sec (det. 44974.47 ticks)

My linearization may be smart, but it is not beating Cplex's automatic reformulation.

Conclusion

This looks like a case where an off-the-shelf solver can solve a model under investigation without too much trouble. There is no real need to develop algorithms or heuristics. In general, my advice is that before contemplating writing algorithms yourself, always try to write an explicit optimization model and solve with standard solvers. If this works we are done. If not, start thinking about other approaches. Even in that case, the model results provide insight (think about documentation, debugging, and solution quality: is my heuristic working and delivering good solutions for smaller instances?). I almost never dive into using heuristics before developing a proper optimization model.

References

Find graph colouring that minimises sum of weights of edges that share vertex colours, https://stackoverflow.com/questions/70804766/find-graph-colouring-that-minimises-sum-of-weights-of-edges-that-share-vertex-co
Coloring the US county map, https://yetanothermathprogrammingconsultant.blogspot.com/2022/01/coloring-us-county-map.html

Appendix: GAMS Model

$ontext

Problem from
https://stackoverflow.com/questions/70804766/find-graph-colouring-that-minimises-sum-of-weights-of-edges-that-share-vertex-co

     Suppose we have a weighted complete graph with n vertices. We want to colour
     each vertex one of k colours. The cost of any colouring is the sum of the
     weights of all edges that connect vertices of the same colour.

     For my use case, n will be around 20-50. For k = 2, 3 brute force worked
     fine, but I would ideally like a solution for any k. It is also the case
     that the graphs I'm dealing with have reasonable sparsity, but again,
     I would ideally like to solve the general case.

References:
     https://yetanothermathprogrammingconsultant.blogspot.com/2022/01/a-different-coloring-problem.html

$offtext

*---------------------------------------------------------------
* random graph
*
* sparse undirected network:
* arcs only from i->j when i<j
*---------------------------------------------------------------

set
   n 'nodes' /n1*n50/
   a(n,n) 'arcs'
   c 'colors' /color1*color3/
;

alias(n,i,j);

a(i,j)$(ord(i)<ord(j)) = yes;

parameter w(n,n) 'weights';
w(a)$(uniform(0,1)<0.25) = uniform(0,1);
display w;

*---------------------------------------------------------------
* MIQP model
*---------------------------------------------------------------

binary variables
   x(n,c) 'assign color to node'
;

variable
   cost 'number of arcs with same colored nodes'
;

equations
   objective            'minimize cost'
   assign(n)            'every node should have exactly one color'
;

objective.. cost =e= sum((a(i,j),c),w(a)*x(i,c)*x(j,c));

assign(n).. sum(c, x(n,c)) =e= 1;

model color1 /all/;
option optcr=0, miqcp=cplex, threads=16;
solve color1 minimizing cost using miqcp;

*---------------------------------------------------------------
* MIP model
*---------------------------------------------------------------

binary variable y(i,j) 'nodes have the same color';

equations
   objective2    'linearized objective'
   ybound(i,j,c) 'x(i,c)=x(j,c)=1 => y(i,j)=1'
;

objective2..        cost =e= sum(a,w(a)*y(a));
ybound(a(i,j),c)$w(a).. y(a) =g= x(i,c)+x(j,c)-1;

model color2 /objective2,assign,ybound/;

* optionally relax y to continuous between 0 and 1
y.prior(a) = INF;

solve color2 minimizing cost using mip;

*---------------------------------------------------------------
* symmetry
* impose restrictions on first few nodes
*   node1 = color1
*   node2 = color1 or color2
*   node3 = color1 or color2 or color3
*---------------------------------------------------------------

set allowed(n,c) 'allowed colors for first card(c)-1 nodes';
allowed(n,c) = ord(c) <= ord(n) and ord(n) < card(c);
display allowed;

equation symmetry(n) 'break some symmetry';

symmetry(n)$(ord(n)<card(c)).. sum(allowed(n,c),x(n,c)) =e= 1;

model color3 /color1,symmetry/;
solve color3 minimizing cost using miqcp;

*---------------------------------------------------------------
* reporting
*---------------------------------------------------------------

display x.l,cost.l;

parameter wx 'obj contribution';
wx(a(i,j)) = w(a)*sum(c,x.l(i,c)*x.l(j,c));
display wx;

3 comments:

Rob PrattJanuary 26, 2022 at 2:08 PM
Nice post as always. The y[i,j] <= x[i,c] and y[i,j] <= x[j,c] constraints would instead need y[i,j,c] variables to work properly.
AnonymousJuly 14, 2024 at 9:54 PM
This is a cool problem. I set it up in python to see how an open source solver would perform. Im using pyoptinterface with Highs and its able to solve the k=5 case in about 120 seconds, and is unable to solve the k=3 case. I do believe the problem becomes much harder to solve (even with cplex) for higher probabilities of non empty w.

Tuesday, January 25, 2022

A different coloring problem