Wednesday, August 5, 2020

Concatenate numbers is not easy in a MIP model

From [1]: 

Here is the problem: 
  • There are 4 integer variables from within a limited set = [1, 10, 20, 40, 100, 200]
  • When all 4 is concatenated (say 1 || 1 || 10 || 20 = 111020) the result can be divisible by 13. 
I tried Python's pyomo library but can not find a solution as there is no modulus operation to decide if the number can be divisible by 13.

There is no objective in this problem, so let's invent a simple one: find the smallest concatenated number that is divisible by 13. 

Discussion


There are two issues here:
  • Requiring a number is divisible by 13. This is easy. Just use an integer variable \(\mathit{mul13}\in \{0,1,2,3,\dots\}\) and impose the constraint \[ x = 13 \cdot \mathit{mul13}\] This will assure that \(x\) is a multiple of 13.
  • Concatenate numbers as if they are strings. This is not so easy. MIP solvers do not know anything about strings. So we have to find a work-around. 

Let's work from the back. 10 || 20 can be interpreted as: \[20 + 10^{\mathit{numDigits}} \cdot 10\] where \(\mathit{numDigits}\) is the number of digits of 20 (i.e., \(\mathit{numDigits}=2\)). So, the 10 is actually worth 1000. With this idea, we can setup the recursion. Let's call each substring a "piece", indexed by \(p\). So, we have: \[\begin{align} & \mathit{cumDigits}_p = \begin{cases} \mathit{cumDigits}_{p+1} + \mathit{numDigits}_p & p \in \{1,2,3,4\} \\ 0 & p \gt 4 \end{cases}\\ & \mathit{cumValue}_p = \begin{cases} \mathit{cumValue}_{p+1} + 10^{\mathit{cumDigits}_{p+1}} \cdot \mathit{pieceValue}_p  &p \in \{1,2,3,4\} \\ 0 &p \gt 4 \end{cases} \end{align}\]  where \[\begin{align} & \mathit{numDigits}_p  &&\text{number of digits in $p$-th piece} \\  & \mathit{cumDigits}_p && \text{cumulative number of digits in pieces $p,p+1,\dots$}\\ &\mathit{pieceValue}_p && \text{value of piece $p$}\\ & \mathit{cumValue}_p && \text{cumulative value of pieces $p,p+1,\dots$} \end{align}\] 

The values of  \( \mathit{numDigits}_p\) and \(\mathit{value}_p \) depend on what value we have chosen for piece \(p\). 

Obviously, this is quite convoluted. Furthermore, recursion for the cumulative value calculation is non-linear. Let's see if we can put this together in an MINLP model.

Data


The data is organized as follows: 

----     14 SET p  pieces

piece1,    piece2,    piece3,    piece4


----     14 SET s  selection from values

select1,    select2,    select3,    select4,    select5,    select6


----     14 PARAMETER value  selectable values

select1   1,    select2  10,    select3  20,    select4  40,    select5 100,    select6 200


----     14 PARAMETER digit  number of digits

select1 1,    select2 2,    select3 2,    select4 2,    select5 3,    select6 3


The number of digits for each value is calculated as: \[\mathit{digit}_p = \Bigl \lfloor{ 1 + \log_{10} \mathit{value}_p }\Bigr \rfloor \] where the funny brackets indicate the floor function. 
  

Mathematical Model


A complete model can look like:

MINLP Model
\[ \begin{align} \min\>& \color{darkred} z \\ & \sum_s \color{darkred}{\mathit{select}}_{p,s} = 1 && \forall p && \text{select exactly one value for each piece $p$}&& \text{(linear)}\\ & \color{darkred}{\mathit{pieceValue}}_p = \sum_s \color{darkblue}{\mathit{value}}_s \cdot \color{darkred}{\mathit{select}}_{p,s} && \forall p && \text{selected value} && \text{(linear)}\\ & \color{darkred}{\mathit{numDigits}}_p = \sum_s \color{darkblue}{\mathit{digit}}_s \cdot \color{darkred}{\mathit{select}}_{p,s} && \forall p && \text{associated number of digits}&& \text{(linear)}\\ & \color{darkred}{\mathit{cumDigits}}_p = \color{darkred}{\mathit{cumDigits}}_{p+1} + \color{darkred}{\mathit{numDigits}}_p && \forall p && \text{cumulative number of digits}&& \text{(linear)}\\ & \color{darkred}{\mathit{cumValue}}_p = \color{darkred}{\mathit{cumValue}}_{p+1} + 10^{\color{darkred}{\mathit{cumDigits}}_{p+1}} \cdot \color{darkred}{\mathit{pieceValue}}_p && \forall p &&\text{cumulative value} && \text{(non-linear)} \\ & \color{darkred} z = \color{darkred}{\mathit{cumValue}}_{\color{darkgreen}{\mathit{piece1}}} && && \text{objective variable} && \text{(linear)}\\ & \color{darkred} z = 13 \cdot \color{darkred}{\mathit{mul13}} && && \text{must be multiple of 13} && \text{(linear)}\\ & \color{darkred}{\mathit{select}}_{p,s} \in \{0,1\} \\ & \color{darkred}{\mathit{pieceValue}}_p \in \{0,1,2,\dots\} \\ & \color{darkred}{\mathit{numDigits}}_p \in \{0,1,2,\dots\} \\ & \color{darkred}{\mathit{cumDigits}}_p \in \{0,1,2,\dots\} \\ & \color{darkred}{\mathit{cumValue}}_p \in \{0,1,2,\dots\} \\ & \color{darkred}{\mathit{mul13}} \in \{0,1,2,\dots\} \end{align}\]


In the above, we assume that  \({\mathit{cumValue}}_{p+1}\) and \({\mathit{cumDigits}}_{p+1}\) have a value of zero when we go beyond the last element. This is automatic in GAMS, so we can follow the above model quite straightforwardly.

Results 


Lo and behold, Baron can solve the model very quickly. The only things I added are reasonable upper bounds on the integer variables and a setting OPTCR=0 in order to search for proven global solutions. The results look like:


----     61 VARIABLE select.L  selection of value

           select1     select2

piece1           1
piece2                       1
piece3           1
piece4           1


----     61 VARIABLE pieceValue.L  value of single piece

piece1  1,    piece2 10,    piece3  1,    piece4  1


----     61 VARIABLE numDigits.L  number of digits of single piece

piece1 1,    piece2 2,    piece3 1,    piece4 1


----     61 VARIABLE cumValue.L  cumulative value

piece1 11011,    piece2  1011,    piece3    11,    piece4     1


----     61 VARIABLE cumDigits.L  cumulative number of digits

piece1 5,    piece2 4,    piece3 2,    piece4 1


----     61 VARIABLE Mul13.L               =          847  multiple of 13
            VARIABLE z.L                   =        11011  objective
 

Complete enumeration


To verify the solution we can use a complete enumeration approach. The total number of cases is \(6^4=1296\). This is not too large. Below is a table that illustrates the enumeration. Only the first 50 records are shown. The column div13 has a one where the concatenated number is a multiple of 13. The column smallest marks the best row.


----     66 PARAMETER data  complete enumeration

           piece1      piece2      piece3      piece4      concat       div13    smallest

k1              1           1           1           1        1111
k2              1           1           1          10       11110
k3              1           1           1          20       11120
k4              1           1           1          40       11140
k5              1           1           1         100      111100
k6              1           1           1         200      111200
k7              1           1          10           1       11101
k8              1           1          10          10      111010
k9              1           1          10          20      111020           1
k10             1           1          10          40      111040
k11             1           1          10         100     1110100
k12             1           1          10         200     1110200           1
k13             1           1          20           1       11201
k14             1           1          20          10      112010
k15             1           1          20          20      112020
k16             1           1          20          40      112040
k17             1           1          20         100     1120100
k18             1           1          20         200     1120200
k19             1           1          40           1       11401           1
k20             1           1          40          10      114010           1
k21             1           1          40          20      114020
k22             1           1          40          40      114040
k23             1           1          40         100     1140100           1
k24             1           1          40         200     1140200
k25             1           1         100           1      111001
k26             1           1         100          10     1110010
k27             1           1         100          20     1110020
k28             1           1         100          40     1110040
k29             1           1         100         100    11100100
k30             1           1         100         200    11100200
k31             1           1         200           1      112001
k32             1           1         200          10     1120010
k33             1           1         200          20     1120020
k34             1           1         200          40     1120040
k35             1           1         200         100    11200100
k36             1           1         200         200    11200200
k37             1          10           1           1       11011           1           1
k38             1          10           1          10      110110           1
k39             1          10           1          20      110120
k40             1          10           1          40      110140
k41             1          10           1         100     1101100           1
k42             1          10           1         200     1101200
k43             1          10          10           1      110101
k44             1          10          10          10     1101010
k45             1          10          10          20     1101020
k46             1          10          10          40     1101040
k47             1          10          10         100    11010100
k48             1          10          10         200    11010200
k49             1          10          20           1      110201           1
k50             1          10          20          10     1102010           1
 

This confirms the solution we found using Baron.


Conclusion


The "string concatenation" problem in the original problem statement is not very easy to model in an MINLP model. We made a heroic effort, and it seems to work for this example. Baron was able to find the optimal solution. However, due to the large numbers involved, I doubt this approach can be used for larger problems. MINLP and MIP algorithms are not very suited for large integer variables. 
 

References


  1. Could this problem be solved using mixed integer non-linear programming or any other optimization technique?, https://stackoverflow.com/questions/63136355/could-this-problem-be-solved-using-mixed-integer-non-linear-programming-or-any-o

No comments:

Post a Comment