In [1], I discussed coloring all US counties (3,221 of them) such that neighboring counties have a different color. The famous 4-color theorem says we can do this with just 4 colors. After some adventures with the data, I was indeed able to produce maps with 4 colors.

Model 1: US Counties colored with minimum number of colors |

An interesting part is the **color count** table. That table shows the number of counties colored by each color. This is a bit unbalanced. So an obvious question is:

Can we find a 4-coloring that has a more equal use of colors?

To recap, the original graph coloring model looked like:

Model 1: Minimize number of colors model |
---|

\[\begin{align} \min & \sum_c \color{darkred}u_c \\ & \sum_c \color{darkred}x_{n,c}=1 &&\forall n \\ & \color{darkred}x_{i,c}+\color{darkred}x_{j,c} \le \color{darkred}u_c && \forall \color{darkblue}{\mathit{arc}}_{i,j}, c \\ & \color{darkred}u_c \le \color{darkred}u_{c-1} && \forall c\gt 1 \\ & \color{darkred}x_{n,\color{darkgreen}{\mathit{color1}}} = 1 && \text{for all isolated counties}\\ & \color{darkred}x_{n,c} \in \{0,1\} \\ & \color{darkred}u_c \in \{0,1\} \end{align} \] |

This model delivered 4 as the optimal solution. Now we want to minimize the spread in the color count. We can reuse the original model, and convert the objective into a constraint: \[\sum_c \color{darkred}u_c=4\] but another, probably simpler approach is just to restrict the set \(c\) of colors to just 4 elements. With this, a model can be:

Model 2: Minimize spread in color count |
---|

\[\begin{align} \min\> & \color{darkred}{\mathit{range}} = \color{darkred}{\mathit{maxCount}} -\color{darkred}{\mathit{minCount}}\\ &\color{darkred}{\mathit{maxCount}} \ge \color{darkred}{\mathit{count}}_c && \forall c\\ &\color{darkred}{\mathit{minCount}} \le \color{darkred}{\mathit{count}}_c && \forall c\\ &\color{darkred}{\mathit{count}}_c = \sum_n\color{darkred}x_{n,c} && \forall c\\ & \sum_c \color{darkred}x_{n,c}=1 &&\forall n \\ & \color{darkred}x_{i,c}+\color{darkred}x_{j,c} \le 1 && \forall \color{darkblue}{\mathit{arc}}_{i,j}, c \\ & \color{darkred}x_{n,c} \in \{0,1\} \\ & c = \{\color{darkgreen}{col1},\color{darkgreen}{col2},\color{darkgreen}{col3},\color{darkgreen}{col4}\} \end{align} \] |

- The variable \(\color{darkred}u_c\) is dropped. We know it is 1 if set \(c\) has just 4 colors.
- Isolated counties now become interesting again: they play a role. We no longer fix them to have color 1.

#### Symmetry breaking

Alabama counties |

#### The worst spread

**minimize**by

**maximize**. Unfortunately, this is not the case. Finding the max and the min count is now more difficult: we need extra binary variables. The idea is to use the following formulation: \[\color{darkred}y=\max_i \color{darkred}x_i \iff \begin{aligned} & \forall i: \color{darkred}y\ge \color{darkred}x_i\\ & \exists i: \color{darkred}y\le \color{darkred}x_i\end{aligned} \iff \begin{aligned} & \color{darkred}y\ge \color{darkred}x_i &&\forall i \\ & \color{darkred}y\le \color{darkred}x_i - \color{darkblue}M (1-\color{darkred}\delta_i) &&\forall i \\ & \sum_i \color{darkred}\delta_i=1 \\ & \color{darkred}\delta_i \in \{0,1\}\end{aligned} \]

Model 3a: Maximize spread in color count |
---|

\[\begin{align} \max\> & \color{darkred}{\mathit{range}} = \color{darkred}{\mathit{maxCount}} -\color{darkred}{\mathit{minCount}}\\ &\color{darkred}{\mathit{maxCount}} \ge \color{darkred}{\mathit{count}}_c && \forall c \\ &\color{darkred}{\mathit{maxCount}} \le \color{darkred}{\mathit{count}}_c + \color{dakblue}M \color{darkred}\delta^{\mathit{max}}_c&&\forall c \\ & \sum_c \color{darkred}\delta^{\mathit{max}}_c = {\bf{card}}(c)-1 \\ &\color{darkred}{\mathit{minCount}} \le \color{darkred}{\mathit{count}}_c && \forall c \\ &\color{darkred}{\mathit{minCount}} \ge \color{darkred}{\mathit{count}}_c - \color{dakblue}M \color{darkred}\delta^{\mathit{min}}_c&&\forall c \\ & \sum_c \color{darkred}\delta^{\mathit{min}}_c = {\bf{card}}(c)-1 \\ &\color{darkred}{\mathit{count}}_c = \sum_n\color{darkred}x_{n,c} && \forall c\\ & \sum_c \color{darkred}x_{n,c}=1 &&\forall n \\ & \color{darkred}x_{i,c}+\color{darkred}x_{j,c} \le 1 && \forall \color{darkblue}{\mathit{arc}}_{i,j}, c \\ & \color{darkred}x_{n,c} \in \{0,1\} \\ & \color{darkred}\delta^{\mathit{min}}_c, \color{darkred}\delta^{\mathit{max}}_c \in \{0,1\} \\ & c = \{\color{darkgreen}{col1},\color{darkgreen}{col2},\color{darkgreen}{col3},\color{darkgreen}{col4}\} \end{align} \] |

- We can set \(\color{darkblue}M := {\bf{card}}(c)\).
- We can use the solution of model 1 as incumbent using the
**mipstart**option. - This is a very difficult model to solve.

Model 3b: Maximize spread in color count |
---|

\[\begin{align} \max\> & \color{darkred}{\mathit{range}} = \color{darkred}{\mathit{count}}_{\color{darkgreen}{\mathit{col1}}} - \color{darkred}{\mathit{count}}_{\color{darkgreen}{\mathit{col4}}}\\ &\color{darkred}{\mathit{count}}_c = \sum_n\color{darkred}x_{n,c} && \forall c\\ & \sum_c \color{darkred}x_{n,c}=1 &&\forall n \\ & \color{darkred}x_{i,c}+\color{darkred}x_{j,c} \le 1 && \forall \color{darkblue}{\mathit{arc}}_{i,j}, c \\ &\color{darkred}{\mathit{count}}_c \le \color{darkred}{\mathit{count}}_{c-1} && \forall c \gt 1 \\ & \color{darkred}x_{n,c} \in \{0,1\} \\ & c = \{\color{darkgreen}{col1},\color{darkgreen}{col2},\color{darkgreen}{col3},\color{darkgreen}{col4}\} \end{align} \] |

#### Conclusion

#### References

- http://yetanothermathprogrammingconsultant.blogspot.com/2022/01/coloring-us-county-map.html
- Data-Driven Documents, https://d3js.org/.

#### Appendix: GAMS Model 2

$ontext Color
map with US countiesFiles:ColorCounties.gms
(this file)data.incgeojson-counties-fips.js$offtext *---------------------------------------------------------------* data*---------------------------------------------------------------setsn 'nodes: counties (fips code)' a(n,n) 'arcs: adjacency -- upper-triangular part only' c 'colors' / col1 '#96CEB4'col2 '#FFEEAD'col3 '#D9534F'col4 '#FFAD60'/ county 'county names' map(n,county<) ; $include data.inc * fix data wrt Glades County, FL* set one of these to "no"a('12043','12085') = no;*a('12043','12099') = no;scalar numCounties 'number of counties in map';numCounties = card(n);display numCounties;alias(n,i,j);* we have a few isolated counties (Hawaii)set isolated(n) 'county without neighbors';isolated(n) = sum(j$(a(n,j) or a(j,n)),1)=0;display isolated;*---------------------------------------------------------------* data checks*---------------------------------------------------------------abort$(card(map)<>numCounties) "check map";abort$sum(a(i,j)$(ord(i)>=ord(j)),1) "a(i,j) is not strictly upper-triangular";*---------------------------------------------------------------* optimization model: minimize number of colors needed*---------------------------------------------------------------binary variablesx(n,c) 'assign color to node' ; variablenumColors 'number of colors needed' range maxCount minCount Count(c) 'number of counties colored with c' ; equationsobjective 'make number of times each color is used as equal as possible' assign(n) 'every node should have exactly one color' notSameColor(i,j,c) 'adjacent vertices cannot have the same color' boundMax boundMin counting ; objective.. range =e= maxCount-minCount; boundMax(c).. maxCount =g= count(c); boundMin(c).. minCount =l= count(c); counting(c).. Count(c) =e= sum(n, x(n,c));assign(n).. sum(c, x(n,c)) =e= 1;notSameColor(a(i,j),c).. x(i,c)+x(j,c) =l= 1; *---------------------------------------------------------------* fix a few counties by looking at the map* 01001 .
"Autauga County, AL" ->
color 1* neigbor is: Dallas County:* 01047 .
"Dallas County, AL" ->
color 2* a neigbor to both is: Chilton* 01021 .
"Chilton County, AL" ->
color 3*---------------------------------------------------------------x.fx('01001','col1') = 1; x.fx('01047','col2') = 1; x.fx('01021','col3') = 1; model color /all/;option optcr=0,
threads=16;solve color
minimizing range using mip;display x.l;parameter colcount(c) 'number of counties with
given color';colcount(c) = sum(n$(x.l(n,c)>0.5), 1);display colcount; |

## No comments:

## Post a Comment