Friday, June 19, 2020

Small example: ordering of binary variables


Problem description


Consider the vector \[v = (1, 3, 6, 4, 7, 9, 6, 2, 3) \] Partition \(v\) into two consecutive parts, such that the absolute difference between the sums of the values in the two parts is minimized. More formally, find the index \(k\) such that \[z = \left| \sum_{i \lt k} v_i - \sum_{i \ge k} v_i \right|\] is minimized. I.e., the sum to the left of the break point is as close as possible to the sum to the right of it.

This is easily programmed directly; a small brute-force sketch is shown below. The question, however, is: can we do this in the form of an optimization problem? [1]
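Computing this directly is just a scan over all candidate break points. Here is a minimal brute-force sketch in GAMS (the names best, bestk and diff are mine and not part of the optimization model developed below):

set i 'observations' /i1*i9/;
alias (i,kk);
parameter v(i) 'data' /i1 1, i2 3, i3 6, i4 4, i5 7, i6 9, i7 6, i8 2, i9 3/;
scalar best 'best objective so far' /inf/, bestk 'best break point' /1/, diff;
loop(kk,
*  difference between the sum left of the break point kk and the sum from kk onwards
   diff = abs(sum(i$(ord(i) < ord(kk)), v(i)) - sum(i$(ord(i) >= ord(kk)), v(i)));
   if(diff < best,
      best = diff;
      bestk = ord(kk);
   );
);
display best, bestk;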

Model formulation


The trick is to have a binary variable \(\delta_i \in \{0,1\}\) associated with each observation \(v_i\), \(i=1,\dots, n\). The interpretation is: \[ \delta_i = \begin{cases} 0 & \text{if $v_i$ belongs to the first subset}\\ 1 & \text{if $v_i$ belongs to the second subset}\end{cases}\] To make sure all \(\delta_i = 0\) are to the left and all \(\delta_i = 1\) are to the right of the break point, we require \[ \delta_{i+1} \ge \delta_{i}\] This implies we start with \(\delta_i = 0\), and once we see \(\delta_i = 1\), it never goes back to zero. The high-level model can look like:


Integer-programming model
\[\begin{align}\min\>& \color{darkred} z= |\color{darkred}y_{\mathit{left}} - \color{darkred}y_{\mathit{right}}| \\ &\color{darkred}y_{\mathit{left}} = \sum_i (1-\color{darkred}\delta_i) \color{darkblue}v_i \\ &\color{darkred}y_{\mathit{right}} = \sum_i \color{darkred}\delta_i \color{darkblue}v_i \\ & \color{darkred}k = 1 + \sum_i (1-\color{darkred}\delta_i) \\ & \color{darkred}\delta_{i+1} \ge \color{darkred}\delta_{i} \\ & \color{darkred}\delta_{i} \in \{0,1\} \end{align}\]

Notes:

  • The expression \( \sum_i (1-\delta_i) v_i\) sums up the \(v_i\) where \(\delta_i=0\), while \( \sum_i \delta_i v_i\) adds up the \(v_i\) where \(\delta_i=1\). 
  • Make sure the ordering constraint does not go over the boundary: the constraint \(\delta_{i+1} \ge \delta_{i}\) should only hold for \(i=1,\dots,n-1\).  
  • Obviously, we can also write this as \(\delta_{i-1} \le \delta_{i}\), for \(i=2,\dots,n\). 
  • The absolute value can be linearized by formulating it as \[\begin{align} \min\> & z\\ & -z \le y_{\mathit{left}} - y_{\mathit{right}} \le z\\ & z\ge 0\end{align}\] A short argument for why this is exact follows after these notes. 
  • Note that there are no big-M constraints in the model. As we shall see later, this model solves very quickly.
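To see why the absolute-value linearization above is exact: the two inequalities state \[-z \le y_{\mathit{left}} - y_{\mathit{right}} \le z \quad\Longleftrightarrow\quad z \ge \left| y_{\mathit{left}} - y_{\mathit{right}} \right|\] and since \(z\) is minimized, it is driven down onto this bound, so that at an optimal solution \(z = \left| y_{\mathit{left}} - y_{\mathit{right}} \right|\). Note that this argument relies on \(z\) being minimized.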


The results look like:


----     50 PARAMETER results  

               i1          i2          i3          i4          i5          i6          i7          i8          i9

v               1           3           6           4           7           9           6           2           3
delta                                                                       1           1           1           1


----     50 VARIABLE y.L  sum of partitions

left  21.000,    right 20.000


----     50 VARIABLE k.L                   =            6  breakpoint
            VARIABLE z.L                   =            1  objective


The zeros are not printed.
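As a quick check of these numbers: \(\delta_6 = \dots = \delta_9 = 1\) gives break point \(k = 1 + 5 = 6\), with \[y_{\mathit{left}} = 1+3+6+4+7 = 21, \qquad y_{\mathit{right}} = 9+6+2+3 = 20\] so \(z = |21-20| = 1\), in agreement with the reported solution.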

Of course, the model also works for larger data sets, and for \(v_i\) that are floating-point numbers, possibly negative. An example is:


----     51 PARAMETER v  data

i1   -65.651,    i2    68.653,    i3    10.075,    i4   -39.772,    i5   -41.558,    i6   -55.189,    i7   -30.034
i8    71.254,    i9   -86.577,    i10    0.042,    i11   99.624,    i12   15.747,    i13   98.227,    i14   52.450
i15  -73.862,    i16   27.944,    i17  -68.096,    i18  -49.984,    i19   33.786,    i20  -12.929,    i21  -28.060
i22  -29.712,    i23  -73.702,    i24  -69.980,    i25   17.823,    i26   66.179,    i27  -53.837,    i28   33.147
i29   55.172,    i30  -39.268,    i31  -77.902,    i32    0.477,    i33  -67.965,    i34   74.492,    i35  -46.977
i36  -42.837,    i37   18.791,    i38   44.544,    i39   25.650,    i40   -7.240,    i41  -17.339,    i42  -76.461
i43  -37.158,    i44  -90.690,    i45  -32.290,    i46  -63.580,    i47   29.145,    i48   12.149,    i49   53.992
i50  -40.439,    i51   32.221,    i52   51.164,    i53   25.489,    i54  -43.227,    i55  -82.715,    i56  -79.497
i57   28.250,    i58    9.062,    i59  -93.695,    i60   58.472,    i61  -85.447,    i62  -64.868,    i63    5.127
i64   50.042,    i65  -64.375,    i66  -93.172,    i67   17.026,    i68   24.246,    i69  -22.128,    i70  -28.257
i71  -51.393,    i72  -50.716,    i73  -73.899,    i74   86.690,    i75  -24.012,    i76   56.680,    i77  -39.993
i78  -74.903,    i79   49.775,    i80  -86.154,    i81  -59.597,    i82  -98.987,    i83  -46.077,    i84   -0.030
i85  -69.743,    i86  -65.166,    i87  -33.872,    i88  -36.619,    i89  -35.583,    i90   92.795,    i91   98.720
i92  -26.019,    i93  -25.422,    i94   54.396,    i95  -20.663,    i96   82.619,    i97  -76.084,    i98   47.096
i99  -88.916,    i100  15.260


----     51 VARIABLE delta.L  ordered binary variables

i62  1,    i63  1,    i64  1,    i65  1,    i66  1,    i67  1,    i68  1,    i69  1,    i70  1,    i71  1,    i72  1
i73  1,    i74  1,    i75  1,    i76  1,    i77  1,    i78  1,    i79  1,    i80  1,    i81  1,    i82  1,    i83  1
i84  1,    i85  1,    i86  1,    i87  1,    i88  1,    i89  1,    i90  1,    i91  1,    i92  1,    i93  1,    i94  1
i95  1,    i96  1,    i97  1,    i98  1,    i99  1,    i100 1


----     51 VARIABLE y.L  sum of partitions

left  -689.647,    right -676.178


----     51 VARIABLE k.L                   =           62  breakpoint
            VARIABLE z.L                   =       13.469  objective

This model with \(n=100\) solves instantaneously. For \(n=1{,}000\) the solve takes a second or so.


The ordering constraint is a useful construct to have in your toolbox, and it is used in other contexts as well. It can be used to order solutions for better display and interpretation, and to reduce symmetry in certain models; a small sketch of the latter follows below.
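As a small illustration of the symmetry-breaking use, here is a hypothetical sketch (the sets, data and variables are made up for this illustration and are not part of this post's model): in a load-balancing model with identical bins, ordering the bins by their load removes many permuted but otherwise equivalent solutions.

set it 'items'          /it1*it20/
    b  'identical bins' /b1*b5/;
parameter w(it) 'item sizes';
w(it) = uniform(1,10);

binary variable x(it,b) 'assign item it to bin b';
variable binload(b) 'total size assigned to bin b'
         maxload    'largest bin load (to be minimized)';

equation assign(it)  'every item goes into exactly one bin'
         loaddef(b)  'compute the load of each bin'
         maxdef(b)   'maxload bounds every load'
         symbreak(b) 'order bins by decreasing load (symmetry breaking)';

assign(it)..    sum(b, x(it,b)) =e= 1;
loaddef(b)..    binload(b) =e= sum(it, w(it)*x(it,b));
maxdef(b)..     maxload =g= binload(b);
* same lagged construct as the order constraint above
symbreak(b-1).. binload(b-1) =g= binload(b);

model bins /all/;
solve bins minimizing maxload using mip;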

GAMS implementation


Here follows the GAMS model. Note the handling of the order constraint with the lagged variable reference. It can be debugged (or its correctness verified) by inspecting the Equation Listing, after increasing the option limrow.

set
   i 'observations' /i1*i9/
   j 'partitions' /left,right/
;

parameter v 'data' /i1 1, i2 3, i3 6, i4 4, i5 7, i6 9, i7 6, i8 2, i9 3/;
display v;

binary variable delta(i) 'ordered binary variables';
variable
   z    'objective'
   y(j) 'sum of partitions'
   k    'breakpoint'
;

equations
   obj1     'linearized objective'
   obj2     'linearized objective'
   ydef1    'y(left)'
   ydef2    'y(right)'
   order(i) 'ordering of binary variables'
   kdef     'breakpoint'
;

*
* objective: min abs(y('left')-y('right'))
* we linearize this
*
obj1.. -z =l= y('left')-y('right');
obj2..        y('left')-y('right') =l= z;
z.lo = 0;

*
*  calculate y
*
ydef1.. y('left') =e= sum(i, (1-delta(i))*v(i));
ydef2.. y('right') =e= sum(i, delta(i)*v(i));

*
* ordering of binary variables
* check the equation listing (after using option limrow)
*
option limrow=10;
order(i-1).. delta(i) =g= delta(i-1);

*
* breakpoint
*
kdef.. k =e= 1+sum(i,(1-delta(i)));

model m /all/;
option optcr=0;
solve m minimizing z using mip;

*
* print results
*

parameter results(*,*);
results('v',i) = v(i);
results('delta',i) = delta.l(i);

option results:0,k:0,z:0;
display
     results,
     y.l,
     k.l,
     z.l
;



References


  1. Optimize with indexing in linear programming, https://stackoverflow.com/questions/62432731/optimize-with-indexing-in-linear-programming

5 comments:

  1. Hi Erwin, could you share your GAMS codes? It would be helpful for people like me to learn GAMS. Thank you!

    Replies
    1. Hi Erwin, thank you for posting your GAMS model. I also wrote a GAMS model to solve it. https://github.com/jjjch/GAMS/blob/master/ordering_of_binary_variables.gms


      Your handling of "z.lo = 0;" and "order(i-1).. delta(i) =g= delta(i-1);" are better. Thank you for sharing!

  2. Hi Erwin, when there are multiple optimal solutions, if we want the solution where k is as close as possible to the middle, how do we model it?

    Replies
    1. That makes it a multiple objective problem. Standard approaches can be used for this (scalarization through weighted objective, lexicographic).
