In [1] an optimization model is proposed for the following problem:

Using a list of 5-letter words, try to find a set of 5 words such that each letter is used at most once.

### Data

The list of 5-letter words is from the wordle [2] game. Words with duplicate letters are removed. A partial list is:

---- 1035 SETwwords: large list of 5-letter wordsabdom, abend, abets, abhor, abide, abies, abyes, abilo, abime, abysm, abled, abler, ables ablet, ablow, abmho, abner, abnet, abode, abody, abohm, aboil, abord, abort, abote, about above, abret, abrim, abrin, abris, abrus, absey, absit, abstr, abune, abuse, abush, abuts acedy, acerb, ached, achen, acher, aches, achor, acidy, acids, acier, acies, acyls, acing ackey, acker, aclys, acmes, acned, acnes, acoin, acold, acone, acorn, acost, acoup, acred acres, acrid, acryl, acron, acrux, acted, actin, acton, actor, actos, actus, acute, adcon

The total number of 5-letter words in this list is: 10,175.

For our model, we need to know if character \(c\) is present in word \(w\). For this, we form a 2-dimensional set \(\color{darkblue}{\mathit{inWord}}_{w,c}\). A partial view looks like this:

---- 1053 SETinWordcharacter c is in word wa b c d e f g h i j abdom YES YES YES abend YES YES YES YES abets YES YES YES abhor YES YES YES abide YES YES YES YES YES abies YES YES YES YES

### Model

A simple model to find a maximum subset of words that have unique letters can be:

Covering Model |
---|

\[\begin{align}\max& \sum_w \color{darkred}x_w \\ & \sum_{w|\color{darkblue}{\mathit{inWord}}(w,c)} \color{darkred}x_w \le 1 && \forall c \\ & \color{darkred}x_w \in \{0,1\} \end{align}\] |

---- 1082 VARIABLEx.Lselect wordbrick 1.000, fldxt 1.000, nymph 1.000, quags 1.000, vejoz 1.000 ---- 1082 VARIABLEcount.L = 5.000number of selected words

**fldxt**(an abbreviation for fluid extract).

### All solutions

**solution pool**for that. This is a technology present in most of the more advanced MIP solvers, to find and report more than one solution. When I do that, we see that the total number of 5-word solutions is 831. The solution time to find all these solutions is about a minute (78 seconds).

According to Parker, executing the Parker algorithm on a laptop took about a month.

### References

- Christopher D. Long, https://github.com/octonion/puzzles/tree/baba3c605ceb83155fd427c00f8812290a66bb7e/twitter/parker
- Wordle, https://www.nytimes.com/games/wordle/index.html
- K best solutions for an assignment problem, https://yetanothermathprogrammingconsultant.blogspot.com/2018/04/k-best-solutions-for-assignment-problem.html
- Benjamin Paassen, https://gitlab.com/bpaassen/five_clique

### Appendix: GAMS model

$ontext $offtext $set inc words.inc $set url https://raw.githubusercontent.com/aopt/OptimizationModels/main/Wordle/%inc% *-------------------------------------------------* 5-letter Words*-------------------------------------------------* download file$if not exist %inc% $call curl -O %url% setw 'words: large list of 5-letter
words' /$include %inc% / ; display w;scalar numWords 'number of words in our collection';numWords = card(w);display numWords;*-------------------------------------------------* Derived data* Characters in words*-------------------------------------------------setsc 'characters' /a*z/p 'position' /1*5/inWord(w,c) 'character c is in word w'; * ord(string,n) = ascii number of string[n]inWord(w,c) = sum(p$(ord(w.tl,p.val)=ord(c.tl)), 1);display inWord;*-------------------------------------------------* Optimization Model*-------------------------------------------------binary variable x(w) 'select word';variable count 'number of selected words';equationscounting 'number of
different words with unique characters'cover(c) 'number of
selected words containing char c'; counting.. count =e= sum(w, x(w));cover(c).. sum(inWord(w,c), x(w)) =l= 1;model m /all/;solve m maximizing
count using mip;display x.l,count.l;* remove if you want to see all solutions$stop *-------------------------------------------------* How many solutions?*-------------------------------------------------$onecho > cplex.opt SolnPoolAGap = 0.5SolnPoolIntensity = 4PopulateLim = 10000SolnPoolPop = 2solnpoolmerge solutions.gdx$offecho m.optfile=1; option threads=0;* not strictly needed, but seems to helpcount.fx = round(count.l); Solve m using mip
maximizing count ;setindex0 'superset of solution indices' /soln_m_p1*soln_m_p10000/index(index0) 'used solution indices'; parameterxall(index0,w) 'all solutions'; execute_load "solutions.gdx",index,xall=x;display xall;scalar numSolutions
'number of solutions';numSolutions = card(index);display
numSolutions; |

## No comments:

## Post a Comment