## Thursday, January 4, 2024

### String Art

In [1], a greyscale picture is approximated by strings (lines) between points around the image. Here, I will try something similar with a formal optimization model.

A greyscale image can be represented by an $$m\times n$$ matrix of floating point values between zero and one. A value of zero indicates a black pixel, and a value of one a white pixel. In the discussion below, when I use the term color, this means the greyscale value.

Here is a $$160\times 120$$ image. The pixels have values between 0 and 0.97. Let's call this matrix $$\color{darkblue}{\mathit{GreyScale}}_{i,j} \in [0,1]$$. Click on the picture to enlarge it a bit.

In my experiments, I just generate $$N=150$$ random points inside the image and draw possible lines between them. (The advantage of using mathematical models is that I am not bound to some physical reality: no physical pieces of strings are needed). The pixels that are covered by the lines we draw get a greyscale equal to the sum of the greyscales of the lines. (We assume additivity here: if a pixel is covered by several lines, we add up the contribution of each line. I think this may not be the same for real-world strings.) Here is a picture of the points and possible lines:

 N=150 randomly placed points

 All connecting lines

With $$N=150$$ points, we can draw $M=\frac{N(N+1)}{2}=11,325$ lines. This includes lines from a point to itself, i.e. singleton points. I am not sure if adding or discarding these $$N$$ singletons makes a difference.

It does not look that way in the above picture, but all these lines cover 16,870 pixels (that is 88% of all pixels), while 2,330 pixels are not touched by any line. One of the pixels is covered by 278 different lines!

It is difficult to see how selecting lines from this collection can give us something that resembles the original picture.

If each line has the same fixed color $$\color{darkblue}\gamma \in [0,1]$$, then we can write the optimization problem as an unconstrained MIQP (Mixed-Integer Quadratic Programming) problem:

Integer Optimization Model
\begin{align}\min&\sum_{i,j} \left( \sum_k \color{darkblue}\gamma \cdot \color{darkblue}{\mathit{LinePixels}}_{k,i,j}\cdot \color{darkred}x_{k} -\color{darkblue}{\mathit{GreyScale}}_{i,j} \right)^2 \\ & \color{darkred}x_k \in \{0,1\} \end{align}

Here, the binary variable $$\color{darkred}x_{k}$$ indicates whether we use line $$k$$. The (very sparse) boolean parameter $$\color{darkblue}{\mathit{LinePixels}}_{k,i,j}$$ tells us if pixel $$(i,j)$$ is covered by line $$k$$.

Observations:
• By adding up the greyscale levels for pixels covered by multiple lines, we can easily end up with a total greyscale value exceeding 1. We never can go below 0, so this is a bit asymmetric.
• We can skip pixels that are not covered by any line. This means using a subset of $$(i,j)$$.
• Large MIQPs are difficult to solve to optimality. This is no exception. Whether we use some linearization or solve as a quadratic model, this is not very easy.
• Solvers like Cplex may linearize this automatically. This involves forming all cross-products, and then linearize $$\color{darkred}x_{k_1}\cdot \color{darkred}x_{k_2}$$. This generates a ton of extra binary variables. In some cases, as a result, Cplex was running out of memory.
• Sometimes it is better to write: \begin{align}\min&\sum_{i,j} \color{darkred}e^2_{i,j}\\ & \color{darkred}e_{i,j}= \sum_k \color{darkblue}\gamma \cdot \color{darkblue}{\mathit{LinePixels}}_{k,i,j}\cdot \color{darkred}x_{k} -\color{darkblue}{\mathit{GreyScale}}_{i,j}\end{align} This makes the quadratic part (numerically) easier.
• Using a LAD (Least Absolute Deviation) objective also gives us a linear model: \begin{align}\min&\sum_{i,j} \left( \color{darkred}e^+_{i,j}+\color{darkred}e^-_{i,j}\right) \\ & \color{darkred}e^+_{i,j}-\color{darkred}e^-_{i,j}= \sum_k \color{darkblue}\gamma \cdot \color{darkblue}{\mathit{LinePixels}}_{k,i,j}\cdot \color{darkred}x_{k} -\color{darkblue}{\mathit{GreyScale}}_{i,j}\\ & \color{darkred}e^+_{i,j},\color{darkred}e^-_{i,j}\ge 0\end{align} Note that the automatic linearization discussed earlier, is exact. It will give the same solutions as the quadratic problem. Here, we change the problem (from a 2-norm to a 1-norm objective), so solutions will be somewhat different.
• We may not want to have pixel values above 1. An alternative objective could be: $\min \sum_{i,j} \left( \min\left\{1,\sum_k \color{darkblue}\gamma \cdot \color{darkblue}{\mathit{LinePixels}}_{k,i,j}\cdot \color{darkred}x_{k}\right\} -\color{darkblue}{\mathit{GreyScale}}_{i,j} \right)^2$ However, this complicates the model considerably. Something like $$\color{darkred}y=\min(\color{darkblue}a,\color{darkred}x)$$ requires a binary variable $$\color{darkred}\delta\in\{0,1\}$$: \begin{align}&\color{darkred}y\le\color{darkblue}a\\&\color{darkred}y\le\color{darkred}x\\&\color{darkred}y\ge\color{darkblue}a-\color{darkblue}M\cdot\color{darkred}\delta\\&\color{darkred}y\ge\color{darkred}x-\color{darkblue}M\cdot(1-\color{darkred}\delta)\end{align} unless there is something we can exploit. I don't see how to do this much smarter. If no good bounds are available to determine $$\color{darkblue}M$$, we can use a SOS1 set.
• Here, we used that colors are additive. A different approach could be to make the observed color the maximum of the underlying line colors. I.e. $\min\sum_{i,j} \left( \max_k \left\{ \color{darkblue}\gamma \cdot \color{darkblue}{\mathit{LinePixels}}_{k,i,j}\cdot \color{darkred}x_{k}\right\} -\color{darkblue}{\mathit{GreyScale}}_{i,j} \right)^2$ This again (for the same reasons as the previous bullet) is making the model much more difficult.
• A possible extension is to make $$\color{darkblue}\gamma$$ a variable: let the model decide the best color for all selected lines. Note that  $$\color{darkred}\gamma$$  is not indexed by $$k$$ here. So, we have a single color for all lines. Below, I discuss the case where we allow $$\color{darkred}\gamma_k$$. I.e., a different color for each line. That can be combined with $$\color{darkred}x_{k}$$.
• In the discussion about the mathematics, [1] drops the integrality conditions and also the bounds on $$\color{darkred}x_{k}$$. Removing the bounds does not make much sense to me. Also, in [1], lines are thicker and use some kind of anti-aliasing scheme. I use very thin lines without anti-aliasing.

If we want to allow that each line has its own color (determined by the model), then the problem actually becomes much easier:

Continuous Optimization Model
\begin{align}\min&\sum_{i,j} \left( \sum_k \color{darkblue}{\mathit{LinePixels}}_{k,i,j} \cdot \color{darkred}x_{k} -\color{darkblue}{\mathit{GreyScale}}_{i,j} \right)^2 \\ & \color{darkred}x_k \in [0,1] \end{align}

Now $$\color{darkred}x_k$$ does not just turn line $$k$$ on, but also determines its color. The resulting model is an unconstrained QP (Quadratic Programming) model (but with bounds on $$\color{darkred}x$$).

Solving this leads to the following somewhat creepy picture:

 Solution: best fit

All pixels in the solution with a color value larger than one were reset to 1.0.  I.e. the plotted pixels have a value: $v_{i,j} := \min\left\{1,\sum_k \color{darkblue}{\mathit{LinePixels}}_{k,i,j} \cdot \color{darkred}x^*_{k}\right\}$

Einstein's wild hair is certainly a problem for this model. But we can somewhat recognize his face.

### Conclusion

We can approximate a greyscale image by drawing lines between relatively few points. Selecting lines of the same exogenous color (greyscale value) can be modeled as a large, difficult MIQP model. Reinterpreting the problem, allowing each line to have its own endogenous color, leads to a much simpler QP model. The result is a picture that somewhat resembles the original.

The mathematical review in [1] is a bit less rigorous than what I show here. I stay closer to the original optimization problem. I don't sweep the integer restrictions and bounds on the decision variables under the rug.

### References

1. The Mathematics of String Art, https://www.youtube.com/watch?v=WGccIFf6MF8

### Appendix: GAMS Model

 *----------------------------------------------------------------------------* data*---------------------------------------------------------------------------- $include data5.inc$ontext Summary:    data5.inc   pixels: (240, 180)  nodes:  150  lines:  11325   sets    i 'rows'    j 'columns'    k 'nodes'    lines(k,k) 'line node->node'    linepixels(k,k,i,j) 'pixels affected by line k1->k2'  parameter    greyscale(i,j) 'pixel values of greyscale image' \$offtext  set pixels(i,j) /#i.#j/; parameter sizes(*);sizes('lines') = card(lines);sizes('linepixels') = card(linepixels);display sizes; *----------------------------------------------------------------------------* pixels not covered by any line?*---------------------------------------------------------------------------- sets  covpix(i,j) 'pixels covered by some line'  noncovpix(i,j) 'pixels not covered by any line'; alias (k,k1,k2);covpix(i,j) = sum(linepixels(k1,k2,i,j),1);noncovpix(i,j) = not covpix(i,j); sizes('covpix') = card(covpix);sizes('noncovpix') = card(noncovpix);display sizes; *----------------------------------------------------------------------------* model m: unconstrained QP*---------------------------------------------------------------------------- scalar gamma 'color factor' /1.0/; option miqcp=cplex,rmiqcp=cplex,limrow=0,limcol=0,threads=8; variable z 'objective';binary variable x(k,k) 'select lines'; equation obj;obj.. z =e= sum(covpix, sqr(gamma*sum(lines,linepixels(lines,covpix)*x(lines)) - greyscale(covpix))); model m /obj/;solve m minimizing z using rmiqcp;

This model is skipping pixels that are not covered by any line. That is not really needed.