Yet Another Math Programming Consultant: August 2020

Monday, August 31, 2020

Speed dating

Speed Networking. [source]

In [1] a speed dating (or speed networking [2]) problem is proposed, where parties representing buyers and sellers can have a short meeting with different parties.

There are 10 buyers,
and 10 sellers.
Data is available telling us which buyers/sellers are interested in talking to each other.
Meetings are organized in 15-minute rounds, with a limited number of rounds (say 8 or 10).

Interestingly, the single word "SpeedDating" is trade-marked, while "Speed Dating" is (still) free to use [3].

Data

I generated some random data so we have something to solve:

----     13 SET b  buyers

buyer1 ,    buyer2 ,    buyer3 ,    buyer4 ,    buyer5 ,    buyer6 ,    buyer7 ,    buyer8 ,    buyer9 ,    buyer10


----     13 SET s  sellers

seller1 ,    seller2 ,    seller3 ,    seller4 ,    seller5 ,    seller6 ,    seller7 ,    seller8 ,    seller9 
seller10


----     13 SET r  rounds

round1,    round2,    round3,    round4,    round5,    round6,    round7,    round8


----     13 SET wantMeeting  a meeting has been requested

            seller1     seller2     seller3     seller4     seller5     seller6     seller7     seller8     seller9

buyer1          YES                                                                                             YES
buyer2                                                          YES                     YES
buyer3                                  YES         YES
buyer4          YES                     YES
buyer5                      YES                     YES                     YES
buyer6                                                          YES         YES                                 YES
buyer7          YES         YES                                 YES         YES
buyer8                                  YES                                                         YES
buyer9                      YES                                 YES         YES
buyer10                                                                                 YES                     YES

      +    seller10

buyer8          YES

Model

To model this we can use the following binary variable: \[x_{b,s,r} = \begin{cases} 1 & \text{if buyer $b$ meets seller $s$ in round $r$}\\ 0 & \text{otherwise}\end{cases}\] Instead of using a fully allocated variable, it is better to use a sparse variant: \[x_{b,s,r} = \begin{cases} 1 & \text{if buyer $b$ meets seller $s$ in round $r$} && \forall b,s,r|\mathit{wantMeeting}_{b,s} \\ 0 & \text{if buyer $b$ does not meet seller $s$ in round $r$} &&\forall b,s,r|\mathit{wantMeeting}_{b,s} \\ \text{not allocated} & && \forall b,s,r| \bf{not }\mathit{wantMeeting}_{b,s} \end{cases}\]

With this, we can formulate the model:

MIP model
\[\large\begin{align} \min \> & \color{darkred}{\mathit{numRounds}} && && (1) \\ & \sum_r \color{darkred}x_{b,s,r} = 1 && \forall b,s\|\color{darkblue}{\mathit{wantMeeting}}_{b,s} && (2) \\ & \sum_{b\|\color{darkblue}{\mathit{wantMeeting}}(b,s)} \color{darkred}x_{b,s,r} \le 1 && \forall s,r&& (3)\\ & \sum_{s\|\color{darkblue}{\mathit{wantMeeting}}(b,s)} \color{darkred}x_{b,s,r} \le 1 && \forall b,r && (4) \\ & \color{darkred}{\mathit{round}}_r \ge \color{darkred}x_{b,s,r} && \forall b,s,r\| \color{darkblue}{\mathit{wantMeeting}}_{b,s} && (5) \\ & \color{darkred}{\mathit{round}}_r \ge \color{darkred}{\mathit{round}}_{r+1} && \forall r \in \{1,\dots,R-1\} && (6) \\ &\color{darkred}{\mathit{numRounds}} = \sum_r \color{darkred}{\mathit{round}}_r && &&(7) \\ & \color{darkred}x_{b,s,r} \in \{0,1\} \\ & \color{darkred}{\mathit{round}}_r \in \{0,1\} \end{align} \]

MIP model

\[\large\begin{align} \min \> & \color{darkred}{\mathit{numRounds}} && && (1) \\ & \sum_r \color{darkred}x_{b,s,r} = 1 && \forall b,s|\color{darkblue}{\mathit{wantMeeting}}_{b,s} && (2) \\ & \sum_{b|\color{darkblue}{\mathit{wantMeeting}}(b,s)} \color{darkred}x_{b,s,r} \le 1 && \forall s,r&& (3)\\ & \sum_{s|\color{darkblue}{\mathit{wantMeeting}}(b,s)} \color{darkred}x_{b,s,r} \le 1 && \forall b,r && (4) \\ & \color{darkred}{\mathit{round}}_r \ge \color{darkred}x_{b,s,r} && \forall b,s,r| \color{darkblue}{\mathit{wantMeeting}}_{b,s} && (5) \\ & \color{darkred}{\mathit{round}}_r \ge \color{darkred}{\mathit{round}}_{r+1} && \forall r \in \{1,\dots,R-1\} && (6) \\ &\color{darkred}{\mathit{numRounds}} = \sum_r \color{darkred}{\mathit{round}}_r && &&(7) \\ & \color{darkred}x_{b,s,r} \in \{0,1\} \\ & \color{darkred}{\mathit{round}}_r \in \{0,1\} \end{align} \]

Some explanation for each equation:

Objective. We minimize the number of rounds we need, in order to form a compact schedule.
This constraint makes sure that any pair that wants a meeting, gets one.
A seller $s$ can not be double booked during a round.
Similarly, a buyer $b$ can only do one meeting per round.
Implication: $x_{b,s,r}=1 \Rightarrow \mathit{round}_r=1$,
Order the used rounds. I.e. the first used round is number one.
Calculate the number of used rounds.

Note that every reference to $x_{b,s,r}$ is protected: only for possible combinations $(b,s)$ this variable is used.

Results

----     47 VARIABLE x.L  meetings

                      round1      round2      round3      round4

buyer1 .seller1            1
buyer1 .seller9                                                1
buyer2 .seller5            1
buyer2 .seller7                                    1
buyer3 .seller3                                    1
buyer3 .seller4            1
buyer4 .seller1                                    1
buyer4 .seller3            1
buyer5 .seller2            1
buyer5 .seller4                                    1
buyer5 .seller6                        1
buyer6 .seller5                                                1
buyer6 .seller6                                    1
buyer6 .seller9            1
buyer7 .seller1                                                1
buyer7 .seller2                        1
buyer7 .seller5                                    1
buyer7 .seller6            1
buyer8 .seller3                        1
buyer8 .seller8            1
buyer8 .seller10                                   1
buyer9 .seller2                                    1
buyer9 .seller5                        1
buyer9 .seller6                                                1
buyer10.seller7            1
buyer10.seller9                                    1


----     47 VARIABLE round.L  round is used

round1 1,    round2 1,    round3 1,    round4 1


----     47 VARIABLE numRounds.L           =            4  number of rounds needed

We can verify that every buyer and every seller is not assigned multiple times to a meeting during the same round.

Capacity constraint

In the model above we assumed no limit on the available tables. In practice there may be such a limit. This can easily be fixed by adding the constraint \[\sum_{b,s|\mathit{wantMeeting}(b,s)} x_{b,s,r} \le \mathit{numTables} \> \forall r \]

Using $\mathit{numTables}=5$ we see:

----     47 VARIABLE x.L  meetings

                      round1      round2      round3      round4      round5      round6

buyer1 .seller1            1
buyer1 .seller9                        1
buyer2 .seller5                                                            1
buyer2 .seller7                                    1
buyer3 .seller3                                                                        1
buyer3 .seller4                                    1
buyer4 .seller1                        1
buyer4 .seller3                                                            1
buyer5 .seller2                                    1
buyer5 .seller4                                                            1
buyer5 .seller6                        1
buyer6 .seller5                        1
buyer6 .seller6                                                1
buyer6 .seller9                                    1
buyer7 .seller1                                                1
buyer7 .seller2                        1
buyer7 .seller5                                                                        1
buyer7 .seller6                                    1
buyer8 .seller3                                                1
buyer8 .seller8            1
buyer8 .seller10                                                                       1
buyer9 .seller2                                                                        1
buyer9 .seller5                                                1
buyer9 .seller6            1
buyer10.seller7                                                                        1
buyer10.seller9            1


----     47 VARIABLE round.L  round is used

round1 1,    round2 1,    round3 1,    round4 1,    round5 1,    round6 1


----     47 VARIABLE numRounds.L           =            6  number of rounds needed

As expected, fewer tables will lead to a longer schedule.

GAMS model

Here is the GAMS representation of the MIP model.

$ontext

   small demo model for designing a schedule for a speed dating event

   erwin@amsterdamoptimization.com

   1. we have 10 buyers and 10 sellers
   2. given is a set with wanted meetings between buyers and sellers
   3. up to 8 rounds
   4. find the most compact schedule for this problem

$offtext

set
b 'buyers' /buyer1*buyer10/
s 'sellers' /seller1*seller10/
r 'rounds' /round1*round8/
;

scalar numTables 'number of available tables' /5/;

set wantMeeting(b,s) 'a meeting has been requested';
wantMeeting(b,s)$(uniform(0,1)<0.2) = yes;

option wantMeeting:0;
display b,s,r,wantMeeting;

binary variable x(b,s,r) 'meetings';
binary variable round(r) 'round is used';
variable numRounds 'number of rounds needed';

equations
   doMeet(b,s)      'all wanted meetings are required'
   SellerBusy(s,r) 'seller can do 0 or 1 meetings in a single round'
   BuyerBusy(b,r)   'byuer can do 0 or 1 meetings in a single round'
   MaxTables(r)     'capacity constraint'
   useRound(b,s,r) 'implication: x=1 => round=1'
   order(r)         'ordering of rounds'
   obj              'objective'
;
doMeet(wantMeeting(b,s)).. sum(r, x(b,s,r)) =e= 1;

SellerBusy(s,r).. sum(wantMeeting(b,s),x(b,s,r)) =l= 1;

BuyerBusy(b,r).. sum(wantMeeting(b,s),x(b,s,r)) =l= 1;

MaxTables(r)$numTables.. sum(wantMeeting(b,s), x(b,s,r)) =l= numTables;

useRound(wantMeeting(b,s),r).. round(r) =g= x(b,s,r);

order(r+1).. round(r) =g= round(r+1);

obj.. numRounds =e= sum(r, round(r));

model m /all/;
option optcr=0;
solve m minimizing numRounds using mip;

option x:0,round:0,numrounds:0;
display x.l,round.l,numrounds.l;

Notes:

option optcr=0 sets the allowed gap to zero, So we search for a proven optimal solution.
equation order drops the last r from consideration. Somewhat of a minor detail: the model would still work without this precaution.
Notice how the body of the constraints 2, 3, and 4 are the same. The meaning is determined by the for-all construct.
An alternative is to write: sum(b$wantMeeting(b,s),x(b,s,r)), sum(s$wantMeeting(b,s),x(b,s,r)), and sum((b,s)$wantMeeting(b,s),x(b,s,r))
If numTables=0, we skip the capacity constraint.
In GAMS we just declare x(b,s,r). Only variables referenced in the constraints are actually generated and passed on to the solver. Not all modeling tools use this approach. For instance, CVXPY allocates all declared variables and is not very good at handling sparse variables. Furthermore, CVXPY does not support 3-dimensional variables.

Some remaining questions or experiments:

If the maximum number of rounds is not sufficient, try to allocate as many meetings as possible.
Can we integrate this into one model, or do we need two models (if feasible, minimize the number of rounds, and if infeasible maximize the number of meetings)?
Can we order the solution by the number of meetings in a round (say busy rounds with most meetings first)?
The display of the solution can be improved. E.g. more compact is:

This is not a standard GAMS output.

References

Automated meetings schedule planner, https://stackoverflow.com/questions/63662477/automated-meetings-schedule-planner
Speed networking, https://en.wikipedia.org/wiki/Speed_networking
Speed Dating, https://en.wikipedia.org/wiki/Speed_dating
Speed dating scheduling, https://yetanothermathprogrammingconsultant.blogspot.com/2018/04/speed-dating-scheduling.html This is a model for a related problem. In this problem, the participants are not from two different groups but form one big pool. This makes the problem larger, as we have more possible meetings to consider.

Wednesday, August 26, 2020

MINLP vs MIQCP

In [1] a small non-convex MINLP is stated. The elements are:

We have binary variables $x_i \in \{0,1\}$ for $i=1,\dots,52$
The data is comprised of three arrays: $a_i$, $b_i$ and $c_i$
We want to minimize the following objective: \[\begin{align} - &\left(\frac{1166}{2000} \sum_i a_i \cdot x_i\right) \times \\ & \left(\frac{1}{2100} \sum_i b_i \cdot x_i + 0.05\right) \times \\ & \left(\frac{1}{1500} \sum_i c_i \cdot x_i + 1.5\right)\end{align}\]
Of the first four elements $x_i$, only one has the value 1, and the other three are 0. Same thing for the second, third, etc. groups of four.

Data

----     66 SET i  items

i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9 ,    i10,    i11,    i12,    i13,    i14,    i15
i16,    i17,    i18,    i19,    i20,    i21,    i22,    i23,    i24,    i25,    i26,    i27,    i28,    i29,    i30
i31,    i32,    i33,    i34,    i35,    i36,    i37,    i38,    i39,    i40,    i41,    i42,    i43,    i44,    i45
i46,    i47,    i48,    i49,    i50,    i51,    i52


----     66 SET g  groups

group1 ,    group2 ,    group3 ,    group4 ,    group5 ,    group6 ,    group7 ,    group8 ,    group9 ,    group10
group11,    group12,    group13


----     66 SET group  grouping of items

                 i1          i2          i3          i4          i5          i6          i7          i8          i9

group1          YES         YES         YES         YES
group2                                                          YES         YES         YES         YES
group3                                                                                                          YES

      +         i10         i11         i12         i13         i14         i15         i16         i17         i18

group3          YES         YES         YES
group4                                              YES         YES         YES         YES
group5                                                                                              YES         YES

      +         i19         i20         i21         i22         i23         i24         i25         i26         i27

group5          YES         YES
group6                                  YES         YES         YES         YES
group7                                                                                  YES         YES         YES

      +         i28         i29         i30         i31         i32         i33         i34         i35         i36

group7          YES
group8                      YES         YES         YES         YES
group9                                                                      YES         YES         YES         YES

      +         i37         i38         i39         i40         i41         i42         i43         i44         i45

group10         YES         YES         YES         YES
group11                                                         YES         YES         YES         YES
group12                                                                                                         YES

      +         i46         i47         i48         i49         i50         i51         i52

group12         YES         YES         YES
group13                                             YES         YES         YES         YES


----     66 PARAMETER data  

              a           b           c

i1      251.000     179.000     179.000
i2      179.000     251.000     179.000
i3      215.000     215.000     118.000
i4      251.000                 179.000
i5       63.000      45.000      45.000
i6       45.000      63.000      45.000
i7       54.000      54.000      30.000
i8       63.000                  45.000
i9       47.000      34.000      34.000
i10      34.000      47.000      34.000
i11      40.000      40.000      22.000
i12      47.000                  34.000
i13     141.000     101.000     101.000
i14     101.000     141.000     101.000
i15     121.000     121.000      67.000
i16     141.000                 101.000
i17      47.000      34.000      34.000
i18      34.000      47.000      34.000
i19      40.000      40.000      22.000
i20      47.000                  34.000
i21      94.000      67.000      67.000
i22      67.000      94.000      67.000
i23      81.000      81.000      44.000
i24      94.000                  67.000
i25      47.000      34.000      34.000
i26      34.000      47.000      34.000
i27      40.000      40.000      22.000
i28      47.000                  34.000
i29     157.000     108.000     108.000
i30     108.000     157.000     108.000
i31     133.000     133.000      71.000
i32     157.000                 108.000
i33     126.000      85.000      85.000
i34      85.000     126.000      85.000
i35     106.000     106.000      56.000
i36     126.000                  85.000
i37     126.000      85.000      85.000
i38      85.000     126.000      85.000
i39     106.000     106.000      56.000
i40     126.000                  85.000
i41     110.000      74.000      74.000
i42      74.000     110.000      74.000
i43      92.000      92.000      49.000
i44     110.000                  74.000
i45     110.000      74.000      74.000
i46      74.000     110.000      74.000
i47      92.000      92.000      49.000
i48     110.000                  74.000
i49      63.000      40.000      40.000
i50      40.000      63.000      40.000
i51      52.000      52.000      27.000
i52      63.000                  40.000

The set group was populated using the GAMS statement:

set group(g,i) 'grouping of items';
group(g,i) = 1+floor((ord(i)-0.5)/4)=ord(g);

The term 0.5 is to make sure we don't take the floor of integer values (the floor function can then deliver different results depending on the very last bit - possibly dangerous when we deal with floating-point computations).

Direct MINLP formulation

A direct MINLP model formulation can look like:

MINLP model A
\[\begin{align}\min\> - &\left(\frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i\right) \times \\ & \left(\frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05\right) \times \\ & \left(\frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\right)\\ & \sum_{i\|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

MINLP model A

\[\begin{align}\min\> - &\left(\frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i\right) \times \\ & \left(\frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05\right) \times \\ & \left(\frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\right)\\ & \sum_{i|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

The solution can look like:

----     89 VARIABLE x.L  

i1  1.000,    i5  1.000,    i9  1.000,    i14 1.000,    i17 1.000,    i21 1.000,    i25 1.000,    i29 1.000
i34 1.000,    i38 1.000,    i41 1.000,    i46 1.000,    i49 1.000


----     89 VARIABLE z.L                   =     -889.346  obj

Of course, we see 13 items selected (one for each group of 4).

This is a small model, so we can try a few different global and local solvers.

Solver	Type	Obj	Time	Nodes
Couenne	global	-889.3463	18.25	9972
Baron	global	-889.3463	7.18	389
Antigone	global	Stuck in preprocessing
Bonmin	local	-889.3463	89	2280

Interesting:

Antigone has problems during preprocessing
The local solver Bonmin finds the global optimum but is somewhat slow

Alternative formulation

There is an alternative formulation. We can substitute out the linear parts of the objective function. This will make the model larger (extra variables and equations), but less non-linear (e.g. in terms of non-linear variables).

MINLP model B
\[\begin{align}\max & \prod_k\color{darkred}v_k \\ & \color{darkred}v_1 = \frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i \\ & \color{darkred}v_2 = \frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05 \\ & \color{darkred}v_3 = \frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\\ & \sum_{i\|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

MINLP model B

\[\begin{align}\max & \prod_k\color{darkred}v_k \\ & \color{darkred}v_1 = \frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i \\ & \color{darkred}v_2 = \frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05 \\ & \color{darkred}v_3 = \frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\\ & \sum_{i|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

This formulation has more linear constraints, but in return, we get a much simpler non-linear objective. Note that we also got rid of the minus sign in the objective. This was probably put in there to convert the problem to a minimization problem.

The solution is obviously the same:

----    122 VARIABLE x.L  

i1  1.000,    i6  1.000,    i9  1.000,    i13 1.000,    i17 1.000,    i22 1.000,    i25 1.000,    i29 1.000
i33 1.000,    i38 1.000,    i42 1.000,    i46 1.000,    i49 1.000


----    122 VARIABLE v.L  obj factors

obj1 713.592,    obj2   0.582,    obj3   2.140


----    122 VARIABLE z.L                   =      889.346  obj

Some of the solvers perform better on this model:

Solver	Type	Obj	Time	Nodes
Couenne	global	889.3463	5.1	4398
Baron	global	889.3463	9.5	309
Antigone	global	889.3463	2	5
Bonmin	local	889.3463	105	3336

Couenne and especially Antigone do much better on this formulation.

Non-convex quadratic formulation

Gurobi can solve non-convex quadratic models, so it is interesting to reformulate the problem into a form that Gurobi can digest.

MIQCP model
\[\begin{align}\max\> & \color{darkred}v_1 \cdot \color{darkred}v_{23} \\& \color{darkred}v_{23} = \color{darkred}v_{2} \cdot \color{darkred}v_{3}\\ & \color{darkred}v_1 = \frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i \\ & \color{darkred}v_2 = \frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05 \\ & \color{darkred}v_3 = \frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\\ & \sum_{i\|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

MIQCP model

\[\begin{align}\max\> & \color{darkred}v_1 \cdot \color{darkred}v_{23} \\& \color{darkred}v_{23} = \color{darkred}v_{2} \cdot \color{darkred}v_{3}\\ & \color{darkred}v_1 = \frac{1166}{2000} \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i \\ & \color{darkred}v_2 = \frac{1}{2100} \sum_i \color{darkblue}b_i \cdot \color{darkred}x_i + 0.05 \\ & \color{darkred}v_3 = \frac{1}{1500} \sum_i \color{darkblue}c_i \cdot \color{darkred}x_i + 1.5\\ & \sum_{i|\color{darkblue}{\mathit{group}}(g,i)} \color{darkred}x_i = 1 &&\forall g\\ & \color{darkred}x_i \in \{0,1\}\end{align}\]

Note that this requires the Gurobi option nonConvex=2.

Solver	Type	Obj	Time	Nodes
Gurobi	global	889.3463	0.26	6985

Gurobi solves this model very quickly, although the node count is not that low.

Conclusion

Even this small MINLP model has some interesting wrinkles and reformulations.

References

R Binary Integer Optimization with Groups, https://stackoverflow.com/questions/63584707/r-binary-integer-optimization-with-groups

Monday, August 17, 2020

A variant of the max-sum-subarray problem

In [1] a variant of the max-sum-subarray problem is posed:

Suppose you have two arrays each containing the same amount of "1"s and "-1"s. Additionally, suppose each "1" in the first array has a corresponding or sibling "1" in the second array. Same for each "-1". The task is to find the optimal subarrays, one in the first array and one in the second, such that their combined sum is the largest (maximal) with the added constraint that an element in one subarray only counts towards the sum if the other subarray contains its sibling.

Example

In [1] the following example is given:

An optimal solution is: select elements 3 through 10 in the first array and elements 5 through 10 in the second array, with a total objective of 8.

Reviewing the max-sum-subarray problem

Let's first review the max-sum-subarray problem [2]. We have data $a_i$ with both positive and negative numbers. For example:

----      5 SET i  number of elements

i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9 ,    i10,    i11,    i12,    i13,    i14,    i15
i16,    i17,    i18,    i19,    i20,    i21,    i22,    i23,    i24,    i25,    i26,    i27,    i28,    i29,    i30
i31,    i32,    i33,    i34,    i35,    i36,    i37,    i38,    i39,    i40,    i41,    i42,    i43,    i44,    i45
i46,    i47,    i48,    i49,    i50


----      5 PARAMETER a  data

i1  -5.280,    i2   8.640,    i3   2.006,    i4  -0.865,    i5  -8.367,    i6   1.227,    i7   2.775,    i8   0.607
i9   5.950,    i10  6.512,    i11  4.469,    i12 -5.984,    i13 -5.207,    i14  0.889,    i15  2.652,    i16  6.418
i17 -5.499,    i18 -9.400,    i19 -6.543,    i20 -6.948,    i21 -1.842,    i22 -6.746,    i23  5.136,    i24  8.559
i25 -6.108,    i26  9.186,    i27 -5.854,    i28  6.325,    i29  6.854,    i30 -8.525,    i31 -8.762,    i32  2.467
i33 -3.522,    i34  3.210,    i35 -2.523,    i36 -2.771,    i37  1.839,    i38 -1.234,    i39  2.501,    i40  4.430
i41  3.220,    i42 -9.646,    i43  6.979,    i44 -4.781,    i45 -6.631,    i46  9.047,    i47  0.211,    i48  6.068
i49 -7.734,    i50  7.059

A standard approach is to introduce two parallel binary variables: \[x_i = \begin{cases} 1 & \text{if data item $a_i$ is selected to be included in the subarray} \\ 0 &\text{otherwise}\end{cases}\] and \[\mathit{start}_i = \begin{cases} 1 & \text{when we switch from $x_{i-1}=0$ to $x_{i}=1$} \\ 0 &\text{otherwise}\end{cases}\] The integer programming model can look like:

max-sum-subarray MIP formulation
\[\begin{align}\max\>& \color{darkred} z = \sum_i \color{darkblue}a_i \cdot \color{darkred}x_i \\ & \color{darkred}{\mathit{start}}_i \ge \color{darkred}x_i - \color{darkred}x_{i-1} \\ & \sum_i \color{darkred}{\mathit{start}}_i \le 1 \\ & \color{darkred}x_i \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_i \in \{0,1\} \end{align}\]

We ignored here the else part in the definition of $\mathit{start}$ and just implement the implication: \[x_{i-1}=0\>{\bf and}\>x_i=1 \Rightarrow \mathit{start}_i=1\] Note that the missing else part leaves variables $\mathit{start}_i$ unrestricted when we do not switch from 0 to 1. As we only have an upper limit on the number of starts this is no problem. Remarkably, if we want, we can relax the variable $\mathit{start}_i $ to be continuous between 0 and 1.

When we apply the model on our data set, we see the following results:

----     28 VARIABLE x.L  select a number

i23 1.000,    i24 1.000,    i25 1.000,    i26 1.000,    i27 1.000,    i28 1.000,    i29 1.000


----     28 VARIABLE start.L  start of subarray

i23 1.000


----     28 VARIABLE z.L                   =       24.099  objective

This shows that our optimal sub-array consists of elements 23 through 29.

Original problem

In the original problem, we have two data arrays $a_i$ and $b_i$. In addition, we have a mapping from $a$ to $b$. We can encode the original data set as:

----     12 SET i  number of elements

i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9 ,    i10


----     12 SET k  array

a,    b


----     12 PARAMETER d  data arrays

           i1          i2          i3          i4          i5          i6          i7          i8          i9

a       1.000      -1.000       1.000       1.000      -1.000      -1.000       1.000      -1.000       1.000
b      -1.000       1.000      -1.000      -1.000       1.000       1.000      -1.000       1.000       1.000

+         i10

a       1.000
b       1.000


----     12 SET map  mapping between a and b

             i1          i2          i3          i4          i5          i6          i7          i8          i9

i1                      YES
i2          YES
i3                                                          YES
i4                                                                      YES
i5                                  YES
i6                                                                                  YES
i7                                                                                                          YES
i8                                              YES
i9                                                                                              YES

  +         i10

i10         YES

The data is organized so we can use the "start" formulation for both arrays in one swoop: \[\begin{align} & \mathit{start}_{k,i} \ge x_{k,i} - x_{k,i-1} && \forall k,i\\ &\sum_i \mathit{start}_{k,i} \le 1 && \forall k \\ & x_{k,i}, \mathit{start}_{k,i} \in \{0,1\} \end{align}\] Adding an index to a data-structure is a trick that can help preventing duplicating equations.

The objective is a bit complicated as only entries are counted when both siblings are selected. So, we can write: \[\sum_{i,j|\mathit{map}(i,j)} (d_{a,i}+d_{b,j})\cdot x_{a,i} \cdot x_{b,j} \] This is a quadratic objective. Some solvers can handle this directly. Note that we don't sum over all $(i,j)$ combinations but only over $(i,j)|\mathit{map}_{i,j}$, which has just 10 elements.

The complete model looks like:

MIQP formulation
\[\begin{align}\max\>& \color{darkred} z = \sum_{i,j\|\color{darkblue}{\mathit{map}}(i,j)} (\color{darkblue}d_{\color{darkgreen}a,i}+\color{darkblue}d_{\color{darkgreen}b,j}) \cdot \color{darkred}x_{\color{darkgreen}a,i}\cdot \color{darkred}x_{\color{darkgreen}b,j} \\ & \color{darkred}{\mathit{start}}_{k,i} \ge \color{darkred}x_{k,i} - \color{darkred}x_{k,i-1} && \forall k,i \\ & \sum_i \color{darkred}{\mathit{start}}_{k,i} \le 1 && \forall k\\ & \color{darkred}x_{k,i} \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_{k,i} \in \{0,1\} \end{align}\]

MIQP formulation

When we run this, we may get the following solution:

----     39 VARIABLE x.L  select a number

           i1          i2          i3          i4          i5          i6          i7          i8          i9

a       1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
b                                                       1.000       1.000       1.000       1.000       1.000

+         i10

a       1.000
b       1.000


----     39 VARIABLE start.L  start of subarray

           i1          i5

a       1.000
b                   1.000


----     39 VARIABLE z.L                   =        8.000  objective

This is a slightly different solution but with the same objective. One way to get the same solution as described in [1] is to add a second objective: minimize the number of selected numbers (i.e. a sparser solution). We can do this with a weighted objective: \[\max \sum_{i,j|\mathit{map}(i,j)} (d_{a,i}+d_{b,j})\cdot x_{a,i} \cdot x_{b,j} - \lambda \sum_{k,i} x_{k,i} \] for a small value for $\lambda$, e.g. $\lambda = 0.001$. When we run this, we see:

----     39 VARIABLE x.L  select a number

           i3          i4          i5          i6          i7          i8          i9         i10

a       1.000       1.000       1.000       1.000       1.000       1.000       1.000       1.000
b                               1.000       1.000       1.000       1.000       1.000       1.000


----     39 VARIABLE start.L  start of subarray

           i3          i5

a       1.000
b                   1.000


----     39 VARIABLE z.L                   =        7.986  objective

This corresponds to the solution reported by [1].

Maybe a better approach is to use a lexicographic model. In this case, we don't need a weight $\lambda$, but instead, solve two problems:

Solve the problem with objective \[\max\> z_1 = \sum_{i,j|\mathit{map}(i,j)} (d_{a,i}+d_{b,j})\cdot x_{a,i} \cdot x_{b,j} \]
Record the objective function value of the solution $z^*_1$.
Add the constraint \[ z^*_1 = \sum_{i,j|\mathit{map}(i,j)} (d_{a,i}+d_{b,j})\cdot x_{a,i} \cdot x_{b,j} \]
Solve the problem with objective \[\min\>z_2 = \sum_{k,i} x_{k,i}\]

A disadvantage is that we now have a quadratic constraint in the second problem.

It is not very difficult to linearize this problem. Notice that we only have $n$ products $x_{a,i} \cdot x_{b,j}$ as we need only to consider combinations $(i,j)|\mathit{map}_{i,j}$. We can write:

Linear formulation
\[\begin{align}\max\>& \color{darkred} z = \sum_{i,j\|\color{darkblue}{\mathit{map}}(i,j)} (\color{darkblue}d_{\color{darkgreen}a,i}+\color{darkblue}d_{\color{darkgreen}b,j}) \cdot \color{darkred}y_{i,j} \\ & \color{darkred}{\mathit{start}}_{k,i} \ge \color{darkred}x_{k,i} - \color{darkred}x_{k,i-1} && \forall k,i \\ & \sum_i \color{darkred}{\mathit{start}}_{k,i} \le 1 && \forall k \\ & \color{darkred}y_{i,j} \le \color{darkred}x_{\color{darkgreen}a,i} && \forall i,j\|\color{darkblue}{\mathit{map}}(i,j)\\& \color{darkred}y_{i,j} \le \color{darkred}x_{\color{darkgreen}b,j} && \forall i,j\|\color{darkblue}{\mathit{map}}(i,j) \\& \color{darkred}y_{i,j} \ge \color{darkred}x_{\color{darkgreen}a,i}+ \color{darkred}x_{\color{darkgreen}b,j} -1 && \forall i,j\|\color{darkblue}{\mathit{map}}(i,j) \\ & \color{darkred}x_{k,i} \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_{k,i} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in \{0,1\} \end{align}\]

Linear formulation

\[\begin{align}\max\>& \color{darkred} z = \sum_{i,j|\color{darkblue}{\mathit{map}}(i,j)} (\color{darkblue}d_{\color{darkgreen}a,i}+\color{darkblue}d_{\color{darkgreen}b,j}) \cdot \color{darkred}y_{i,j} \\ & \color{darkred}{\mathit{start}}_{k,i} \ge \color{darkred}x_{k,i} - \color{darkred}x_{k,i-1} && \forall k,i \\ & \sum_i \color{darkred}{\mathit{start}}_{k,i} \le 1 && \forall k \\ & \color{darkred}y_{i,j} \le \color{darkred}x_{\color{darkgreen}a,i} && \forall i,j|\color{darkblue}{\mathit{map}}(i,j)\\& \color{darkred}y_{i,j} \le \color{darkred}x_{\color{darkgreen}b,j} && \forall i,j|\color{darkblue}{\mathit{map}}(i,j) \\& \color{darkred}y_{i,j} \ge \color{darkred}x_{\color{darkgreen}a,i}+ \color{darkred}x_{\color{darkgreen}b,j} -1 && \forall i,j|\color{darkblue}{\mathit{map}}(i,j) \\ & \color{darkred}x_{k,i} \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_{k,i} \in \{0,1\} \\ & \color{darkred}y_{i,j} \in \{0,1\} \end{align}\]

A larger data set

Let's create a larger data set:

50 random numbers for $a$ and $b$ (not restricted to -1, 1),
For simplicity use an identity mapping: $a_i$ is mapped to $b_i$

----     11 SET i  number of elements

i1 ,    i2 ,    i3 ,    i4 ,    i5 ,    i6 ,    i7 ,    i8 ,    i9 ,    i10,    i11,    i12,    i13,    i14,    i15
i16,    i17,    i18,    i19,    i20,    i21,    i22,    i23,    i24,    i25,    i26,    i27,    i28,    i29,    i30
i31,    i32,    i33,    i34,    i35,    i36,    i37,    i38,    i39,    i40,    i41,    i42,    i43,    i44,    i45
i46,    i47,    i48,    i49,    i50


----     11 SET k  array

a,    b


----     11 PARAMETER d  data arrays

           i1          i2          i3          i4          i5          i6          i7          i8          i9

a      -6.565       6.865       1.008      -3.977      -4.156      -5.519      -3.003       7.125      -8.658
b       3.222       5.116       2.549      -4.323      -8.272      -7.950       2.825       0.906      -9.370

+         i10         i11         i12         i13         i14         i15         i16         i17         i18

a       0.004       9.962       1.575       9.823       5.245      -7.386       2.794      -6.810      -4.998
b       5.847      -8.545      -6.487       0.513       5.004      -6.438      -9.317       1.703       2.425

+         i19         i20         i21         i22         i23         i24         i25         i26         i27

a       3.379      -1.293      -2.806      -2.971      -7.370      -6.998       1.782       6.618      -5.384
b      -2.213      -2.826      -5.139      -5.072      -7.390       8.669      -2.401       5.668      -3.999

+         i28         i29         i30         i31         i32         i33         i34         i35         i36

a       3.315       5.517      -3.927      -7.790       0.048      -6.797       7.449      -4.698      -4.284
b      -7.490       4.977      -8.615      -5.960      -9.899      -4.608      -0.003      -6.974      -6.517

+         i37         i38         i39         i40         i41         i42         i43         i44         i45

a       1.879       4.454       2.565      -0.724      -1.734      -7.646      -3.716      -9.069      -3.229
b      -3.387      -3.662      -3.558       9.280       9.872      -2.602      -2.542       5.440      -2.066

+         i46         i47         i48         i49         i50

a      -6.358       2.915       1.215       5.399      -4.044
b       8.262      -7.608       4.710      -8.892       1.526


----     11 SET map  mapping between a and b

             i1          i2          i3          i4          i5          i6          i7          i8          i9

i1          YES
i2                      YES
i3                                  YES
i4                                              YES
i5                                                          YES
i6                                                                      YES
i7                                                                                  YES
i8                                                                                              YES
i9                                                                                                          YES

  +         i10         i11         i12         i13         i14         i15         i16         i17         i18

i10         YES
i11                     YES
i12                                 YES
i13                                             YES
i14                                                         YES
i15                                                                     YES
i16                                                                                 YES
i17                                                                                             YES
i18                                                                                                         YES

  +         i19         i20         i21         i22         i23         i24         i25         i26         i27

i19         YES
i20                     YES
i21                                 YES
i22                                             YES
i23                                                         YES
i24                                                                     YES
i25                                                                                 YES
i26                                                                                             YES
i27                                                                                                         YES

  +         i28         i29         i30         i31         i32         i33         i34         i35         i36

i28         YES
i29                     YES
i30                                 YES
i31                                             YES
i32                                                         YES
i33                                                                     YES
i34                                                                                 YES
i35                                                                                             YES
i36                                                                                                         YES

  +         i37         i38         i39         i40         i41         i42         i43         i44         i45

i37         YES
i38                     YES
i39                                 YES
i40                                             YES
i41                                                         YES
i42                                                                     YES
i43                                                                                 YES
i44                                                                                             YES
i45                                                                                                         YES

  +         i46         i47         i48         i49         i50

i46         YES
i47                     YES
i48                                 YES
i49                                             YES
i50                                                         YES

The solution is found quickly using a good solver. E.g. Cplex needs 0 nodes to solve this to optimality: it finds the optimal solution during preprocessing.

----     44 VARIABLE x.L  select a number

          i10         i11         i12         i13         i14         i15         i16

a       1.000       1.000       1.000       1.000       1.000       1.000       1.000
b       1.000       1.000       1.000       1.000       1.000


----     44 VARIABLE start.L  start of subarray

          i10

a       1.000
b       1.000


----     44 VARIABLE y.L  x('a',i)*x('b',j) for map(i,j)

            i10         i11         i12         i13         i14

i10       1.000
i11                   1.000
i12                               1.000
i13                                           1.000
i14                                                       1.000


----     44 VARIABLE z.L                   =       22.941  objective

We see that $a$ has more selected numbers than $b$, The extra ones have no value, but do not incur a cost either. If we use a lexicographic approach, we can find the sparsest optimal solution:

----     55 VARIABLE x.L  select a number

          i10         i11         i12         i13         i14

a       1.000       1.000       1.000       1.000       1.000
b       1.000       1.000       1.000       1.000       1.000


----     55 VARIABLE start.L  start of subarray

          i10

a       1.000
b       1.000


----     55 VARIABLE y.L  x('a',i)*x('b',j) for map(i,j)

            i10         i11         i12         i13         i14

i10       1.000
i11                   1.000
i12                               1.000
i13                                           1.000
i14                                                       1.000


----     55 VARIABLE z.L                   =       22.941  objective
            VARIABLE z2.L                  =       10.000  second level objective: number of elements selected

With the linearized model, we do not generate a quadratic constraint when using the lexicographic algorithm. This removes a complication. The implementation is shown below.

GAMS model

The GAMS implementation of the last model (lexicographic, linearized) can look like:

set
i 'number of elements' /i1*i50/
k 'array' /a, b/
;
alias(i,j);

parameter d(k, i) 'data arrays';
d(k,i) = uniform(-10,10);

set map(i,j) 'mapping between a and b';
map(i,i) = yes;

display i,k,d,map;

binary variables
   x(k,i) 'select a number'
   start(k,i) 'start of subarray'
   y(i,j) "x('a',i)*x('b',j) for map(i,j)"
;

variable z 'objective: total value';

equations
obj   'level 1 obj: total value'
mul1(i,j) 'linearization'
mul2(i,j) 'linearization'
mul3(i,j) 'linearization'
startbnd(k,i) 'bound on start variable'
sumstart(k)   'allow only one start'
;

obj.. z =e= sum(map(i,j), (d('a',i)+d('b',j))*y(i,j));
mul1(map(i,j)).. y(i,j) =l= x('a',i);
mul2(map(i,j)).. y(i,j) =l= x('b',j);
mul3(map(i,j)).. y(i,j) =g= x('a',i)+x('b',j)-1;
startbnd(k,i).. start(k,i) =g= x(k,i)-x(k,i-1);
sumstart(k).. sum(i, start(k,i)) =l= 1;

model m /all/;
option optcr=0;
solve m maximizing z using mip;

display x.l, start.l, y.l, z.l;

* fix z to its optimal value
z.fx = z.l;

equation obj2 'second level obj: sparsity';
variable z2 'second level objective: number of elements selected';

obj2.. z2 =e= sum((k,i), x(k,i));

model m2 /all/;
solve m2 minimizing z2 using mip;
display x.l, start.l, y.l, z.l, z2.l;

More optimization

There is one more optimization we can use. If the objective coefficient $(d_{a,i}+d_{b,j})$ is positive the model will push $y_{i,j}$ upwards, so we can drop the constraints $y_{i,j} \ge x_{a,i}+ x_{b,j} -1$. If on the other hand, $(d_{a,i}+d_{b,j})$ is negative the model will push $y_{i,j}$ downwards to zero. That means we can drop the constraints: $y_{i,j} \le x_{b,j}$ and $y_{i,j} \le x_{b,j}$, Implementing this can look like: \[\begin{align}& y_{i,j} \le x_{a,i} && \forall i,j|\mathit{map}(i,j), d_{a,i}+d_{b,j} \gt 0\\& y_{i,j} \le x_{b,j} && \forall i,j|\mathit{map}(i,j), d_{a,i}+d_{b,j} \gt 0 \\& y_{i,j} \ge x_{a,i}+ x_{b,j} -1 && \forall i,j|\mathit{map}(i,j), d_{a,i}+d_{b,j} \lt 0\end{align}\] For the special case when $d_{a,i}+d_{b,j}=0$ we can remove the corresponding $y_{i,j}$ from the model. For very large data sets this can make a difference.

A second thing to look at is the type of variables $y$ and $\mathit{start}$. Instead of being a binary variable, we can relax them to be continuous between 0 and 1.

I did some limited tests, and it looks that we are not really helping Cplex with these tricks. For other solvers, things may be different.

References

Is this a graph problem or something else?, https://stackoverflow.com/questions/63444355/is-this-a-graph-problem-or-something-else
Maximum subarray problem, https://en.wikipedia.org/wiki/Maximum_subarray_problem

Tuesday, August 11, 2020

Largest Rectangular Area in a Histogram

Consider a histogram or bar chart. The problem is to find the largest rectangular area covered by the histogram. A picture will help:

Largest rectangle in a histogram

There are a number of algorithms to attack this problem, some of them very efficient, e.g. $O(n)$. [1,2]

Here we try to find out if we can formulate this as an optimization model.

A small data set

Here is a small data set. It corresponds to the picture above.

----     13 SET i  bars

i1,    i2,    i3,    i4,    i5,    i6,    i7


----     13 PARAMETER hist  histogram data

i1 6.000,    i2 2.000,    i3 5.000,    i4 4.000,    i5 5.000,    i6 1.000,    i7 6.000


----     13 PARAMETER maxhist              =        6.000  heighest bar

A quadratic model

We start with our central binary variable \[\mathit{select}_i = \begin{cases} 1 & \text{if bar $i$ is selected to be part of the rectangle} \\ 0 & \text{otherwise}\end{cases}\] In addition, we want to know if an unselected bar is left or right of the selected bars. For this we use a binary variable \[\mathit{lr}_i = \begin{cases} 0 & \text{if unselected bar $i$ is to the left of the selected bars} \\ 1& \text{if unselected bar $i$ is to the right of the selected bars} \\ \text{unrestricted} & \text{if $\mathit{select}_i=1$}\end{cases}\] For the above picture, this looks like:

Note that the variable $\mathit{lr}_i$ can be either 0 or 1 when the corresponding bar is selected.

The variables to describe the rcctangular area are: $\mathit{start}$, $\mathit{width}$ and $\mathit{height}$.

The following implications can be formed. If bar $i$ is not selected and to the left, then $\mathit{start}\gt i$: \[\mathit{select}_i=0 \>{\bf{and}}\> \mathit{lr}_i=0 \Rightarrow \mathit{start}\ge i+1\] Similarly, if bar $i$ is not selected and to the right, then $\mathit{start}+\mathit{width}-1 \lt i$. Formally: \[\mathit{select}_i=0 \>{\bf{and}}\> \mathit{lr}_i=1 \Rightarrow \mathit{start}+\mathit{width}-1\le i-1\] Furthermore, we also have the implication that limits the height: if bar $i$ is selected, then $\mathit{height}\le \mathit{hist}_i$. These implications can be implemented as big-M constraints. Finally, we can link $\mathit{select}_i$ to the width, by saying: \[\sum_i \mathit{select}_i = \mathit{width}\] The complete model can look like:

Quadratic Model
\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\color{darkred}{\mathit{width}}\cdot\color{darkred}{\mathit{height}}\\ & \color{darkred}{\mathit{start}} \ge\color{darkblue}i+1-\color{darkblue}n\cdot \color{darkred}{\mathit{select}}_i -\color{darkblue}n\cdot\color{darkred}{\mathit{lr}}_i&& \forall i \\ & \color{darkred}{\mathit{start}}+\color{darkred}{\mathit{width}} \le \color{darkblue} i+\color{darkblue}n\cdot \color{darkred}{\mathit{select}}_i+\color{darkblue}n\cdot (1-\color{darkred}{\mathit{lr}}_i) && \forall i \\ & \sum_i \color{darkred}{\mathit{select}}_i = \color{darkred}{\mathit{width}} \\ & \color{darkred}{\mathit{height}} \le \color{darkblue} {\mathit{hist}}_i + (\color{darkblue}{\mathit{maxhist}} - \color{darkblue} {\mathit{hist}}_i) \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{start}}+\color{darkred}{\mathit{width}}-1 \le \color{darkblue}n \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{lr}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{start}} \in [1,\color{darkblue}n]\\ & \color{darkred}{\mathit{width}} \in [1,\color{darkblue}n]\\ & \color{darkred}{\mathit{height}} \ge 0\\ & \color{darkblue}n = {\bf{card}}(i) \end{align}\]

Quadratic Model

\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\color{darkred}{\mathit{width}}\cdot\color{darkred}{\mathit{height}}\\ & \color{darkred}{\mathit{start}} \ge\color{darkblue}i+1-\color{darkblue}n\cdot \color{darkred}{\mathit{select}}_i -\color{darkblue}n\cdot\color{darkred}{\mathit{lr}}_i&& \forall i \\ & \color{darkred}{\mathit{start}}+\color{darkred}{\mathit{width}} \le \color{darkblue} i+\color{darkblue}n\cdot \color{darkred}{\mathit{select}}_i+\color{darkblue}n\cdot (1-\color{darkred}{\mathit{lr}}_i) && \forall i \\ & \sum_i \color{darkred}{\mathit{select}}_i = \color{darkred}{\mathit{width}} \\ & \color{darkred}{\mathit{height}} \le \color{darkblue} {\mathit{hist}}_i + (\color{darkblue}{\mathit{maxhist}} - \color{darkblue} {\mathit{hist}}_i) \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{start}}+\color{darkred}{\mathit{width}}-1 \le \color{darkblue}n \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{lr}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{start}} \in [1,\color{darkblue}n]\\ & \color{darkred}{\mathit{width}} \in [1,\color{darkblue}n]\\ & \color{darkred}{\mathit{height}} \ge 0\\ & \color{darkblue}n = {\bf{card}}(i) \end{align}\]

This is a non-convex MIQP (Mixed Integer Quadratic Programming) model and can be solved with solvers like Baron, Couenne, Antigone, Cplex, and Gurobi. Note that Cplex and Gurobi need a special option to allow these models to be solved: optimalityTarget=3 for Cplex and nonConvex=2 for Gurobi.

Results

After running the model we see:

----     68 VARIABLE select.L  bar is selected

i3 1.000,    i4 1.000,    i5 1.000


----     68 VARIABLE lr.L   unselected bars are left (lr=0) or right (lr=1) of selection

i5 1.000,    i6 1.000,    i7 1.000


----     68 VARIABLE start.L               =        3.000  first selected bar
            VARIABLE width.L               =        3.000  number of selected bars
            VARIABLE height.L              =        4.000  shortest selected bar
            VARIABLE area.L                =       12.000  width*height

Note that zero values are not printed.

A larger data set

To test if this formulation work with a larger data set, I generated some random data:

----     12 SET i  bars

i1  ,    i2  ,    i3  ,    i4  ,    i5  ,    i6  ,    i7  ,    i8  ,    i9  ,    i10 ,    i11 ,    i12 ,    i13 
i14 ,    i15 ,    i16 ,    i17 ,    i18 ,    i19 ,    i20 ,    i21 ,    i22 ,    i23 ,    i24 ,    i25 ,    i26 
i27 ,    i28 ,    i29 ,    i30 ,    i31 ,    i32 ,    i33 ,    i34 ,    i35 ,    i36 ,    i37 ,    i38 ,    i39 
i40 ,    i41 ,    i42 ,    i43 ,    i44 ,    i45 ,    i46 ,    i47 ,    i48 ,    i49 ,    i50 ,    i51 ,    i52 
i53 ,    i54 ,    i55 ,    i56 ,    i57 ,    i58 ,    i59 ,    i60 ,    i61 ,    i62 ,    i63 ,    i64 ,    i65 
i66 ,    i67 ,    i68 ,    i69 ,    i70 ,    i71 ,    i72 ,    i73 ,    i74 ,    i75 ,    i76 ,    i77 ,    i78 
i79 ,    i80 ,    i81 ,    i82 ,    i83 ,    i84 ,    i85 ,    i86 ,    i87 ,    i88 ,    i89 ,    i90 ,    i91 
i92 ,    i93 ,    i94 ,    i95 ,    i96 ,    i97 ,    i98 ,    i99 ,    i100


----     12 PARAMETER hist  histogram data

i1   17.175,    i2   84.327,    i3   55.038,    i4   30.114,    i5   29.221,    i6   22.405,    i7   34.983
i8   85.627,    i9    6.711,    i10  50.021,    i11  99.812,    i12  57.873,    i13  99.113,    i14  76.225
i15  13.069,    i16  63.972,    i17  15.952,    i18  25.008,    i19  66.893,    i20  43.536,    i21  35.970
i22  35.144,    i23  13.149,    i24  15.010,    i25  58.911,    i26  83.089,    i27  23.082,    i28  66.573
i29  77.586,    i30  30.366,    i31  11.049,    i32  50.238,    i33  16.017,    i34  87.246,    i35  26.511
i36  28.581,    i37  59.396,    i38  72.272,    i39  62.825,    i40  46.380,    i41  41.331,    i42  11.770
i43  31.421,    i44   4.655,    i45  33.855,    i46  18.210,    i47  64.573,    i48  56.075,    i49  76.996
i50  29.781,    i51  66.111,    i52  75.582,    i53  62.745,    i54  28.386,    i55   8.642,    i56  10.251
i57  64.125,    i58  54.531,    i59   3.152,    i60  79.236,    i61   7.277,    i62  17.566,    i63  52.563
i64  75.021,    i65  17.812,    i66   3.414,    i67  58.513,    i68  62.123,    i69  38.936,    i70  35.871
i71  24.303,    i72  24.642,    i73  13.050,    i74  93.345,    i75  37.994,    i76  78.340,    i77  30.003
i78  12.548,    i79  74.887,    i80   6.923,    i81  20.202,    i82   0.507,    i83  26.961,    i84  49.985
i85  15.129,    i86  17.417,    i87  33.064,    i88  31.691,    i89  32.209,    i90  96.398,    i91  99.360
i92  36.990,    i93  37.289,    i94  77.198,    i95  39.668,    i96  91.310,    i97  11.958,    i98  73.548
i99   5.542,    i100 57.630


----     12 PARAMETER maxhist              =       99.812  heighest bar

When we solve this we see the following results:

----     67 VARIABLE select.L  selection of bar

i10 1.000,    i11 1.000,    i12 1.000,    i13 1.000,    i14 1.000,    i15 1.000,    i16 1.000,    i17 1.000
i18 1.000,    i19 1.000,    i20 1.000,    i21 1.000,    i22 1.000,    i23 1.000,    i24 1.000,    i25 1.000
i26 1.000,    i27 1.000,    i28 1.000,    i29 1.000,    i30 1.000,    i31 1.000,    i32 1.000,    i33 1.000
i34 1.000,    i35 1.000,    i36 1.000,    i37 1.000,    i38 1.000,    i39 1.000,    i40 1.000,    i41 1.000
i42 1.000,    i43 1.000


----     67 VARIABLE lr.L   unselected bars are left (lr=0) or right (lr=1) of selection

i10  1.000,    i11  1.000,    i13  1.000,    i14  1.000,    i16  1.000,    i19  1.000,    i24  1.000,    i30  1.000
i35  1.000,    i36  1.000,    i37  1.000,    i40  1.000,    i42  1.000,    i44  1.000,    i45  1.000,    i46  1.000
i47  1.000,    i48  1.000,    i49  1.000,    i50  1.000,    i51  1.000,    i52  1.000,    i53  1.000,    i54  1.000
i55  1.000,    i56  1.000,    i57  1.000,    i58  1.000,    i59  1.000,    i60  1.000,    i61  1.000,    i62  1.000
i63  1.000,    i64  1.000,    i65  1.000,    i66  1.000,    i67  1.000,    i68  1.000,    i69  1.000,    i70  1.000
i71  1.000,    i72  1.000,    i73  1.000,    i74  1.000,    i75  1.000,    i76  1.000,    i77  1.000,    i78  1.000
i79  1.000,    i80  1.000,    i81  1.000,    i82  1.000,    i83  1.000,    i84  1.000,    i85  1.000,    i86  1.000
i87  1.000,    i88  1.000,    i89  1.000,    i90  1.000,    i91  1.000,    i92  1.000,    i93  1.000,    i94  1.000
i95  1.000,    i96  1.000,    i97  1.000,    i98  1.000,    i99  1.000,    i100 1.000


----     67 VARIABLE start.L               =       10.000  first selected bar
            VARIABLE width.L               =       34.000  number of selected bars
            VARIABLE height.L              =       11.049  shortest selected bar
            VARIABLE area.L                =      375.674  width*height

Cplex needed 200 seconds for this. We can help Cplex a bit by adding the ordering constraint: \[\mathit{lr}_i \ge \mathit{lr}_{i-1}\] This has a significant impact:

	original formulation	with ordering constraint
Objective	375.674	375.674
Time (s)	210	24
Nodes	95,830	6,040
Iterations	2,190,751	153,255

A simpler model

There is a simpler approach. Instead of a scalar variable $\mathit{start}$ we use an indexed variable, indicating where we switch from $\mathit{select}_{i-1}=0$ to $\mathit{select}_{i}=1$. In our picture:

We want contiguous selected bars, or in other words: only a single start variable can be 1. This can be modeled as \[\begin{align} & \mathit{start}_i \ge \mathit{select}_i-\mathit{select}_{i-1} \\ &\sum_i \mathit{start}_i \le 1 \end{align}\] The first inequality can be interpreted as the following implication: \[\mathit{select}_{i-1}=0 \>{\bf{and}}\>\mathit{select}_{i}=1 \Rightarrow \mathit{start}_i=1 \] Note, that $\mathit{start}_i$ is unrestricted in all other cases. This formulation is not uncommon in machine scheduling models where we want to limit the number start-ups.

With this we can state the following quadratic model:

Quadratic Model
\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\sum_i \color{darkred}{\mathit{selected}}_i \cdot \color{darkred}{\mathit{height}}\\ & \color{darkred}{\mathit{start}}_i \ge \color{darkred}{\mathit{select}}_i - \color{darkred}{\mathit{select}}_{i-1} && \forall i \\ & \sum_i \color{darkred}{\mathit{start}}_i \le 1 \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \le \color{darkblue} {\mathit{hist}}_i + (\color{darkblue}{\mathit{maxhist}} - \color{darkblue} {\mathit{hist}}_i) \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{start}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \ge 0\\ \end{align}\]

Quadratic Model

\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\sum_i \color{darkred}{\mathit{selected}}_i \cdot \color{darkred}{\mathit{height}}\\ & \color{darkred}{\mathit{start}}_i \ge \color{darkred}{\mathit{select}}_i - \color{darkred}{\mathit{select}}_{i-1} && \forall i \\ & \sum_i \color{darkred}{\mathit{start}}_i \le 1 \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \le \color{darkblue} {\mathit{hist}}_i + (\color{darkblue}{\mathit{maxhist}} - \color{darkblue} {\mathit{hist}}_i) \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{start}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \ge 0\\ \end{align}\]

This model can be linearized as we multiply a binary variable with a continuous variable [3]. The multiplication \[\begin{align} &y = \delta \cdot x \\ & \delta \in \{0,1\} \\ & x \in [0,U] \end{align}\] can be linearized as: \[\begin{align} & y \le U \cdot \delta\\ & y \le x \\& y \ge x - U \cdot (1-\delta) \\ & \delta \in \{0,1\} \\ & x,y \in [0,U] \end{align}\]

After a little bit of algebra (we have the complication that the variable $\mathit{height}$ can exceed $\mathit{hist}_i$ for non-selected bars), we arrive at the following linear model:

Linear Model
\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\sum_i \color{darkred}{\mathit{adjHeight}}_i \\ & \color{darkred}{\mathit{start}}_i \ge \color{darkred}{\mathit{select}}_i - \color{darkred}{\mathit{select}}_{i-1} && \forall i \\ & \sum_i \color{darkred}{\mathit{start}}_i \le 1 \\ & \color{darkred}{\mathit{adjHeight}}_i \le \color{darkred}{\mathit{select}}_i \cdot \color{darkblue} {\mathit{hist}}_i && \forall i \\ & \color{darkred}{\mathit{adjHeight}}_i \le \color{darkred}{\mathit{height}} && \forall i \\ & \color{darkred}{\mathit{adjHeight}}_i \ge \color{darkred}{\mathit{height}} - \color{darkblue}{\mathit{maxhist}} \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \ge 0\\ & \color{darkred}{\mathit{adjHeight}}_i \ge 0 \\ \end{align}\]

Linear Model

\[\begin{align}\max&\>\color{darkred}{\mathit{area}}=\sum_i \color{darkred}{\mathit{adjHeight}}_i \\ & \color{darkred}{\mathit{start}}_i \ge \color{darkred}{\mathit{select}}_i - \color{darkred}{\mathit{select}}_{i-1} && \forall i \\ & \sum_i \color{darkred}{\mathit{start}}_i \le 1 \\ & \color{darkred}{\mathit{adjHeight}}_i \le \color{darkred}{\mathit{select}}_i \cdot \color{darkblue} {\mathit{hist}}_i && \forall i \\ & \color{darkred}{\mathit{adjHeight}}_i \le \color{darkred}{\mathit{height}} && \forall i \\ & \color{darkred}{\mathit{adjHeight}}_i \ge \color{darkred}{\mathit{height}} - \color{darkblue}{\mathit{maxhist}} \cdot (1- \color{darkred}{\mathit{select}}_i) && \forall i \\ & \color{darkred}{\mathit{select}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{start}}_i \in \{0,1\} \\ & \color{darkred}{\mathit{height}} \ge 0\\ & \color{darkred}{\mathit{adjHeight}}_i \ge 0 \\ \end{align}\]

In this model, we use \[\mathit{adjHeight}_i = \begin{cases} \mathit{height} & \text{if $\mathit{select}_i=1$} \\ 0 & \text{if $\mathit{select}_i=0$}\end{cases}\]

The solution of the small data set for this model looks like:

----     50 VARIABLE select.L  bar is selected

i3 1.000,    i4 1.000,    i5 1.000


----     50 VARIABLE start.L  first selected bar

i3 1.000


----     50 VARIABLE adjHeight.L  adjusted height: heigth*selected(i)

i3 4.000,    i4 4.000,    i5 4.000


----     50 VARIABLE height.L              =        4.000  area height
            VARIABLE area.L                =       12.000

Of course, we also ran it with the large, random data set. The solution looks like:

----     49 VARIABLE select.L  bar is selected

i10 1.000,    i11 1.000,    i12 1.000,    i13 1.000,    i14 1.000,    i15 1.000,    i16 1.000,    i17 1.000
i18 1.000,    i19 1.000,    i20 1.000,    i21 1.000,    i22 1.000,    i23 1.000,    i24 1.000,    i25 1.000
i26 1.000,    i27 1.000,    i28 1.000,    i29 1.000,    i30 1.000,    i31 1.000,    i32 1.000,    i33 1.000
i34 1.000,    i35 1.000,    i36 1.000,    i37 1.000,    i38 1.000,    i39 1.000,    i40 1.000,    i41 1.000
i42 1.000,    i43 1.000


----     49 VARIABLE start.L  first selected bar

i10 1.000


----     49 VARIABLE adjHeight.L  adjusted height: heigth*selected(i)

i10 11.049,    i11 11.049,    i12 11.049,    i13 11.049,    i14 11.049,    i15 11.049,    i16 11.049,    i17 11.049
i18 11.049,    i19 11.049,    i20 11.049,    i21 11.049,    i22 11.049,    i23 11.049,    i24 11.049,    i25 11.049
i26 11.049,    i27 11.049,    i28 11.049,    i29 11.049,    i30 11.049,    i31 11.049,    i32 11.049,    i33 11.049
i34 11.049,    i35 11.049,    i36 11.049,    i37 11.049,    i38 11.049,    i39 11.049,    i40 11.049,    i41 11.049
i42 11.049,    i43 11.049


----     49 VARIABLE height.L              =       11.049  area height
            VARIABLE area.L                =      375.674

To compare the performance, let's add the statistics to our table:

	MIQP original formulation	MIQP with ordering constraint	Linear MIP
Objective	375.674	375.674	375.674
Time (s)	210	24	0.8
Nodes	95,830	6,040	0
Iterations	2,190,751	153,255	589

Cplex was able to solve the MIP model to optimality during pre-processing, so no branch-and-bound nodes were needed.

Conclusion

The Largest Rectangular Area in a Histogram is easy to explain. Writing down the corresponding optimization model is a bit more difficult. In addition, different formulations can lead to dramatic differences in performance. Clearly the linearized model is the winner.

References

Largest rectangular area in a histogram, set 1, https://www.geeksforgeeks.org/largest-rectangular-area-in-a-histogram-set-1/
Largest rectangular area in a histogram, set 2, https://www.geeksforgeeks.org/largest-rectangle-under-histogram/
Multiplication of a continuous and a binary variable, http://yetanothermathprogrammingconsultant.blogspot.com/2008/05/multiplication-of-continuous-and-binary.html