Wednesday, January 20, 2021

Continuous max sum rectangle: MIQP vs GA

This is a follow-up on the post on the max sum submatrix problem [1].  There we looked at the problem of finding a contiguous submatrix that maximizes the sum of the values in this submatrix.


A little bit more complicated is the following continuous version of this problem. 

Assume we have \(n\) points with an \(x\)- and \(y\)-coordinate and a value. Find the rectangle such that the sum of the values of all points inside is maximized.


Data set


I generated a random data set with \(n=100\) points:


----     10 PARAMETER p  points

               x           y       value

i1        17.175       5.141       8.655
i2        84.327       0.601      -3.025
i3        55.038      40.123      -9.834
i4        30.114      51.988       8.977
i5        29.221      62.888       1.438
i6        22.405      22.575      -3.327
i7        34.983      39.612       9.675
i8        85.627      27.601       5.329
i9         6.711      15.237      -7.798
i10       50.021      93.632       9.896
i11       99.812      42.266       1.606
i12       57.873      13.466      -6.672
i13       99.113      38.606       2.867
i14       76.225      37.463      -3.114
i15       13.069      26.848       8.247
i16       63.972      94.837       8.001
i17       15.952      18.894      -9.675
i18       25.008      29.751      -2.627
i19       66.893       7.455       3.288
i20       43.536      40.135       1.868
i21       35.970      10.169      -9.309
i22       35.144      38.389       6.836
i23       13.149      32.409       8.642
i24       15.010      19.213       0.159
i25       58.911      11.237      -4.008
i26       83.089      59.656      -0.068
i27       23.082      51.145      -9.101
i28       66.573       4.507       5.474
i29       77.586      78.310       0.659
i30       30.366      94.575       4.935
i31       11.049      59.646       4.401
i32       50.238      60.734       2.632
i33       16.017      36.251      -7.702
i34       87.246      59.407       9.423
i35       26.511      67.985       4.135
i36       28.581      50.659       9.725
i37       59.396      15.925       7.096
i38       72.272      65.689       2.429
i39       62.825      52.388       4.026
i40       46.380      12.440       4.018
i41       41.331      98.672       5.814
i42       11.770      22.812       2.204
i43       31.421      67.565      -8.914
i44        4.655      77.678      -0.296
i45       33.855      93.245      -8.949
i46       18.210      20.124       3.972
i47       64.573      29.714      -6.104
i48       56.075      19.723      -5.479
i49       76.996      24.635       6.273
i50       29.781      64.648       9.835
i51       66.111      73.497       5.013
i52       75.582       8.544       4.367
i53       62.745      15.035      -9.988
i54       28.386      43.419      -4.723
i55        8.642      18.694       6.476
i56       10.251      69.269       6.391
i57       64.125      76.297       7.208
i58       54.531      15.481      -5.746
i59        3.152      38.938      -0.864
i60       79.236      69.543      -9.233
i61        7.277      84.581      -3.540
i62       17.566      61.272      -1.202
i63       52.563      97.597      -3.693
i64       75.021       2.689      -7.305
i65       17.812      18.745       6.219
i66        3.414       8.712      -1.664
i67       58.513      54.040      -7.164
i68       62.123      12.686      -0.689
i69       38.936      73.400      -4.340
i70       35.871      11.323       7.914
i71       24.303      48.835      -8.712
i72       24.642      79.560      -1.708
i73       13.050      49.205      -3.168
i74       93.345      53.356      -0.634
i75       37.994       1.062       2.853
i76       78.340      54.387       2.872
i77       30.003      45.113      -3.248
i78       12.548      97.533      -7.984
i79       74.887      18.385       8.117
i80        6.923      16.353      -5.653
i81       20.202       2.463       8.377
i82        0.507      17.782      -0.965
i83       26.961       6.132      -8.201
i84       49.985       1.664      -2.516
i85       15.129      83.565      -1.700
i86       17.417      60.166      -1.916
i87       33.064       2.702      -7.767
i88       31.691      19.609       5.023
i89       32.209      95.071       6.068
i90       96.398      33.554      -9.527
i91       99.360      59.426      -0.382
i92       36.990      25.919      -4.428
i93       37.289      64.063       8.032
i94       77.198      15.525      -9.648
i95       39.668      46.002       3.621
i96       91.310      39.334       9.018
i97       11.958      80.546       8.004
i98       73.548      54.099       7.976
i99        5.542      39.072       7.489
i100      57.630      55.782      -2.180





High-level model


Let's denote our points by: \(\color{darkblue}p_{i,a}\), where \(a=\{x,y,{\mathit{value}}\}\). Our decision variables are the corner points of the rectangle: \(\color{darkred}r_{c,q}\) where \(c=\{x,y\}\) and \(q=\{min,max\}\). These variables are continuous between \(\color{darkblue}L=0\) and  \(\color{darkblue}U=100\).

High-level model
\[\begin{align}\max & \sum_{i|\color{darkred}r(c,min) \le \color{darkblue}p(i,c)\le \color{darkred}r(c,max)} \color{darkblue}p_{i,{\mathit{value}}}\\ & \color{darkred}r_{c,min} \le \color{darkred}r_{c,max} \\ &\color{darkred}r_{c,q}\in [L,U]\end{align}\]

This looks easy. Well, not so fast...

Tuesday, January 19, 2021

Submatrix with Largest Sum

Problem


Given a data matrix \(\color{darkblue}a_{i,j}\), find a submatrix such that the sum of the elements is maximized. The term "submatrix" can indicate a contiguous or a non-contiguous subset of rows and columns. 

This is a generalization of the 1d maximum subarray problem [1].

Looks easy, but let's make sure it is.

Example data


Of course, the problem is trivial if all elements are non-negative: just select the whole matrix. It is only interesting if there are negative elements. So here is my random matrix:


----      8 PARAMETER a  matrix (data)

             c1          c2          c3          c4          c5          c6          c7          c8          c9         c10

r1       -6.565       6.865       1.008      -3.977      -4.156      -5.519      -3.003       7.125      -8.658       0.004
r2        9.962       1.575       9.823       5.245      -7.386       2.794      -6.810      -4.998       3.379      -1.293
r3       -2.806      -2.971      -7.370      -6.998       1.782       6.618      -5.384       3.315       5.517      -3.927
r4       -7.790       0.048      -6.797       7.449      -4.698      -4.284       1.879       4.454       2.565      -0.724
r5       -1.734      -7.646      -3.716      -9.069      -3.229      -6.358       2.915       1.215       5.399      -4.044
r6        3.222       5.116       2.549      -4.323      -8.272      -7.950       2.825       0.906      -9.370       5.847
r7       -8.545      -6.487       0.513       5.004      -6.438      -9.317       1.703       2.425      -2.213      -2.826
r8       -5.139      -5.072      -7.390       8.669      -2.401       5.668      -3.999      -7.490       4.977      -8.615
r9       -5.960      -9.899      -4.608      -0.003      -6.974      -6.517      -3.387      -3.662      -3.558       9.280
r10       9.872      -2.602      -2.542       5.440      -2.066       8.262      -7.608       4.710      -8.892       1.526
r11      -8.972      -9.880      -1.975       0.398       2.578      -5.485      -2.078      -4.480      -6.953       8.726
r12      -1.547      -7.307      -2.279      -2.507      -4.630       8.967      -6.221      -4.050      -8.509      -1.973
r13      -7.966      -2.322      -3.518      -6.157      -7.753       1.931       0.229      -9.099       5.662       8.915
r14       1.929       2.147      -2.750       1.881       3.597       0.132      -6.815       3.138       0.478      -7.512
r15       9.734      -5.438       3.513       5.536       8.649      -5.975      -4.057      -6.055      -5.073       2.930
r16       4.699      -8.291      -6.993      -1.316      -6.261       3.854       5.259      -6.904      -2.212       3.909
r17       6.916       2.254       9.519      -9.462      -6.251      -8.258       0.808      -7.463       4.680      -7.735
r18      -0.233       5.912      -0.159       0.671      -9.788       0.877      -0.977       9.507      -6.323      -6.729
r19      -9.507      -6.444      -8.774      -9.667       6.713       2.033      -9.460      -6.078       9.014      -3.289
r20       1.885      -4.816       2.813      -6.895      -0.800      -2.133       6.109       0.820      -2.186       1.156
r21       8.655      -3.025      -9.834       8.977       1.438      -3.327       9.675       5.329      -7.798       9.896
r22       1.606      -6.672       2.867      -3.114       8.247       8.001      -9.675      -2.627       3.288       1.868
r23      -9.309       6.836       8.642       0.159      -4.008      -0.068      -9.101       5.474       0.659       4.935
r24       4.401       2.632      -7.702       9.423       4.135       9.725       7.096       2.429       4.026       4.018
r25       5.814       2.204      -8.914      -0.296      -8.949       3.972      -6.104      -5.479       6.273       9.835
r26       5.013       4.367      -9.988      -4.723       6.476       6.391       7.208      -5.746      -0.864      -9.233
r27      -3.540      -1.202      -3.693      -7.305       6.219      -1.664      -7.164      -0.689      -4.340       7.914
r28      -8.712      -1.708      -3.168      -0.634       2.853       2.872      -3.248      -7.984       8.117      -5.653
r29       8.377      -0.965      -8.201      -2.516      -1.700      -1.916      -7.767       5.023       6.068      -9.527
r30      -0.382      -4.428       8.032      -9.648       3.621       9.018       8.004       7.976       7.489      -2.180


The values are drawn from a uniform distribution, \(\color{darkblue}a_{i,j} \sim U(-10,10)\).

Non-contiguous submatrix


The non-convex MIQP formulation is easy. Just introduce binary variables that indicate if a row or column is selected. A cell is selected if both its corresponding row and column are selected.

MIQP model A
\[\begin{align} \max& \sum_{i,j} \color{darkblue}a_{i,j} \cdot \color{darkred}x_i \cdot \color{darkred}y_j\\ &\color{darkred}x_i, \color{darkred}y_j \in \{0,1\} \end{align}\]

Wednesday, January 6, 2021

Eight soldiers lining up: a very large permutation problem

In [1] an intriguing problem is posted:


There are 8 soldiers, gathering and lining up every morning for their military service. The commander at the head of these soldiers demands that the morning lineup of these soldiers be arranged differently for every next day according to the following rule:

        Any three soldiers cannot be lined up next to each other in the same order for others days.

For example; If ABCDEFGH is the first arrangement for day 1, on the other days, ABC,BCD, CDE, DEF, EFG and FGH cannot be lined up next to each other in the same order any more, but ACB arrangement is okay for other days until used once since they are not in the same order.


Let's first make the problem a bit smaller. Say we have just 4 soldiers. This gives us  \(4!=24\) permutations or line-ups. Let's have a look at the following sets: 

  • \(p\) is the set of permutations
  • \(t\) is the set of substrings of length 3 that we can form from \(p\)
  • \(\color{darkblue}pt(p,t)\) is the mapping between \(p\) and \(t\)

----     53 SET p  permutations or line ups

ABCD,    ABDC,    ACBD,    ACDB,    ADBC,    ADCB,    BACD,    BADC,    BCAD,    BCDA,    BDAC,    BDCA,    CABD
CADB,    CBAD,    CBDA,    CDAB,    CDBA,    DABC,    DACB,    DBAC,    DBCA,    DCAB,    DCBA


----     53 SET t  substrings of length 3

ABC,    BCD,    ABD,    BDC,    ACB,    CBD,    ACD,    CDB,    ADB,    DBC,    ADC,    DCB,    BAC,    BAD,    BCA
CAD,    CDA,    BDA,    DAC,    DCA,    CAB,    CBA,    DAB,    DBA


----     55 SET pt  mapping (computed in Python)

ABCD.ABC,    ABCD.BCD,    ABDC.ABD,    ABDC.BDC,    ACBD.ACB,    ACBD.CBD,    ACDB.ACD
ACDB.CDB,    ADBC.ADB,    ADBC.DBC,    ADCB.ADC,    ADCB.DCB,    BACD.ACD,    BACD.BAC
BADC.ADC,    BADC.BAD,    BCAD.BCA,    BCAD.CAD,    BCDA.BCD,    BCDA.CDA,    BDAC.BDA
BDAC.DAC,    BDCA.BDC,    BDCA.DCA,    CABD.ABD,    CABD.CAB,    CADB.ADB,    CADB.CAD
CBAD.BAD,    CBAD.CBA,    CBDA.CBD,    CBDA.BDA,    CDAB.CDA,    CDAB.DAB,    CDBA.CDB
CDBA.DBA,    DABC.ABC,    DABC.DAB,    DACB.ACB,    DACB.DAC,    DBAC.BAC,    DBAC.DBA
DBCA.DBC,    DBCA.BCA,    DCAB.DCA,    DCAB.CAB,    DCBA.DCB,    DCBA.CBA


----     66 PARAMETER np                   =           24  card(p)
            PARAMETER nt                   =           24  card(t)
            PARAMETER npt                  =           48  card(pt)


In [1], Rob Pratt proposes a MIP model for this:

Model A
\[\begin{align}\max\> & \color{darkred}z = \sum_p \color{darkred}x_p \\ &\sum_{p|\color{darkblue}{pt}(p,t)} \color{darkred}x_p \le 1&& \forall t \\ & \color{darkred}x_p \in \{0,1\}\end{align} \]

Monday, January 4, 2021

Continuous Facility Location: Symmetry, SOCP and grid search

In this post, I consider the following question: 

Given \(m\) consumer locations, what is the minimum number of facilities (warehouses)  needed, \(n\), so each consumer is within \(K\) kilometers of a warehouse. 

A second, related question is:

Where should we place these \(n\) warehouses? 

I looked at this earlier in [1]. I want to add some new experiments: 

  • Do something about symmetry: breaking symmetry by simple ordering constraints. These can be highly effective (but can also disappoint).
  • See if I can formulate the "min sum distance" model as a convex SOCP (second-order cone programming) problem.
  • Compare with a discrete location problem: place facilities on a grid. This makes the model very large but linear. How does it compare to the quadratic models that place facilities anywhere?

Generate consumer locations.


I just use random locations drawn from a uniform distribution: \[\color{darkblue}p_{i,c} \sim U(0,\color{darkblue}S)\] where \(c \in \{x,y\}\) are coordinates and \(\color{darkblue}S=100\). Of course, this is not very realistic. 

Q: What would be a good way to generate more realistic locations? Typically, locations are clustered a bit. Maybe: uniform locations with some random count of consumers at each location.   

For my experiments, I used a dataset \(m=50\) consumer locations. This is both small enough to make the models solve quickly and large enough to show differences in performance.

  

----     37 PARAMETER p  points

              x           y

p1       17.175      84.327
p2       55.038      30.114
p3       29.221      22.405
p4       34.983      85.627
p5        6.711      50.021
p6       99.812      57.873
p7       99.113      76.225
p8       13.069      63.972
p9       15.952      25.008
p10      66.893      43.536
p11      35.970      35.144
p12      13.149      15.010
p13      58.911      83.089
p14      23.082      66.573
p15      77.586      30.366
p16      11.049      50.238
p17      16.017      87.246
p18      26.511      28.581
p19      59.396      72.272
p20      62.825      46.380
p21      41.331      11.770
p22      31.421       4.655
p23      33.855      18.210
p24      64.573      56.075
p25      76.996      29.781
p26      66.111      75.582
p27      62.745      28.386
p28       8.642      10.251
p29      64.125      54.531
p30       3.152      79.236
p31       7.277      17.566
p32      52.563      75.021
p33      17.812       3.414
p34      58.513      62.123
p35      38.936      35.871
p36      24.303      24.642
p37      13.050      93.345
p38      37.994      78.340
p39      30.003      12.548
p40      74.887       6.923
p41      20.202       0.507
p42      26.961      49.985
p43      15.129      17.417
p44      33.064      31.691
p45      32.209      96.398
p46      99.360      36.990
p47      37.289      77.198
p48      39.668      91.310
p49      11.958      73.548
p50       5.542      57.630





Below I will discuss some models that can be used to attack this problem. 

Thursday, December 24, 2020

Numbrix puzzle via TSP

 In [1], I demonstrated a small and rather elegant MIP model to solve Numbrix puzzles. Here is the comment by Austin Buchanan:

It would also be interesting to see how this compares to an MTZ-like model. And how well the two models scale to larger grids.

The Miller-Tucker-Zemlin [2,3] formulation for the Traveling Salesman Problem (TSP) starts with a standard assignment problem, and then adds some sub-tour elimination constraints. From [3]:


Note:  here \(\color{darkblue}p\ge \color{darkblue}n\). This 1960 paper has some interesting computational tests. Many runs ended with: "the machine was stopped after over 250 pivot steps had failed to produce the solution."

Tuesday, December 22, 2020

Numbrix puzzle via MIP

The puzzle of numbrix [1,2] is simple to state.


  • We have a 9x9 board with 81 cells. 
  • We need to fill each cell with a unique integer value between 1 and 81.
  • There are some prefilled cells.
  • We should be able to start with the cell containing value 1 and then following a path (only going one up, down, left or right) to the next values and thus eventually arriving at the cell with value 81.  





Example

Below is an example of an input (the "given" cells) and a solution. The colors may help to see that we have a path from the cell with value 1 to the cell with value 81.


Of course, we want to see if we can create a MIP model for this.

Monday, December 21, 2020

GAMS + R Markdown scripting

In [1] I discussed a small script that exported GAMS data to R for plotting. Here I expand on that and show how we can use R Markdown [2] to generate documents that contain GAMS data and results. Here we have just a small example with just a parameter. But it is easy to see that this can also be used with larger models, where we have a document built around more complex model results. 

R Markdown can handle as input text, R code, R graphics and LaTeX math (among others). Output can be HTML, PDF (via LaTeX), and DOCX (MS Word).

Saturday, December 19, 2020

GAMS + R scripting

 I like to automate things. That means in practice: I write scripts. I prefer to spend a little more time to develop a script, basically for two reasons:

  • A script can be executed again with little or no effort.
  • A script serves as documentation. 
This helps to make things reproducible.

Here is a setup I often use for post-processing GAMS results with R. 

Tuesday, December 15, 2020

Modeling a simple implication

From [1]:

How to model:  

if \(\color{darkred}\delta=0\) then \(\color{darkred}x=0\) else \(\color{darkred}x=\color{darkred}x\)

where \(\color{darkred}\delta\in \{0,1\}\) is a binary variable and \(\color{darkred}x\ge 0\) is a non-negative continuous variable. 


The else part looks strange, but it is meant to indicate: unrestricted. So better is to just drop the else part:


if \(\color{darkred}\delta=0\) then \(\color{darkred}x=0\)


Let's list some possible formulations. 

Friday, December 11, 2020

Electronic Amoeba for solving TSPs?


Abstract

Combinatorial optimization to search for the best solution across a vast number of legal candidates requires the development of a domain-specific computing architecture that can exploit the computational power of physical processes, as conventional general-purpose computers are not powerful enough. Recently, Ising machines that execute quantum annealing or related mechanisms for rapid search have attracted attention. These machines, however, are hard to map application problems into their architecture, and often converge even at an illegal candidate. Here, we demonstrate an analogue electronic computing system for solving the travelling salesman problem, which mimics efficient foraging behaviour of an amoeboid organism by the spontaneous dynamics of an electric current in its core and enables a high problem-mapping flexibility and resilience using a resistance crossbar circuit. The system has high application potential, as it can determine a high-quality legal solution in a time that grows proportionally to the problem size without suffering from the weaknesses of Ising machines.

Picture of the electronic amoeba (from [2])



"Electronic amoeba" sounds awful. Furthermore, the paper talks about "legal solutions". Now the lawyers even get involved... 
 

References

  1. 'Electronic amoeba' finds approximate solution to traveling salesman problem in linear time, https://www.sciencedaily.com/releases/2020/12/201210112050.htm
  2. Kenta Saito, Masashi Aono and Seiya Kasai, Amoeba-inspired analog electronic computing system integrating resistance crossbar for solving the travelling salesman problem, https://www.nature.com/articles/s41598-020-77617-7