Loading [MathJax]/jax/output/CommonHTML/jax.js

Tuesday, January 30, 2024

One nonzero in set of free variables

In [1] the following question is posed:

I have free variables xi. How can I impose the constraint that at least one of the variables is nonzero: xi0.


Observations and formulations


  • xi0 is somewhat difficult. We need to write this as xiϵorxiϵ Here ϵ>0 is a small constant. This "or"-type of constraint needs a binary variable (or something similar). It can't be modeled in a pure LP setting. (We have some non-convexity here).
  • The constant ϵ should not be too small, as the solver will use some feasibility tolerances. These tolerances are, in general, absolute (i.e. not relative). As the solver typically applies some scaling, this tolerance may be even larger than you think (in terms of the original constraints). And who knows what the impact of preprocessing is.  I sometimes see modelers using using something like ϵ=1e-6. This looks too small for me. I typically use something like 0.001 or 0.0001. Of course, this depends on the order of magnitudes of the values that xi can assume. 
  • We can split each xi into three regions: <0,=0,>0. An intuitive way to write this down is using indicator constraintsδi,1=1xiϵδi,2=1ϵxiϵδi,3=1xiϵrδi,r=1ii(δi,1+δi,3)1δi,r{0,1} The middle implication usually needs to be implemented as two constraints. Note that at the points xi=±ϵ we have a bit of ambiguity. In practice, that is not a bad thing: let the solver determine the most profitable region for those cases. The last constraint can also be stated as iδi,2n1 where n is the size of set iI, i.e., n=|I|.
  • Unfortunately, not all modeling tools and solvers support indicator constraints. They really should, as this is a useful modeling concept. If we have good, tight bounds xi[Li,Ui], with Li<0, then we can use binary variables. Here is a possible formulation:  xiϵδi,1+Ui(1δi,1)xiϵδi,2+Li(1δi,2)xiϵδi,2+Ui(1δi,2)xiϵδi,3+Li(1δi,3)rδi,r=1ii(δi,1+δi,3)1δi,r{0,1}
  • If we don't have good bounds on xi and no indicator constraints, we can use so-called Special Ordered Sets of Type 1 (a.k.a. SOS1 sets). A high-level model can look like: |xi|ϵγiiγi1γi{0,1} The non-convex absolute value constraint can be modeled as: y+iyi=xiy+i+yiϵγiy+i,yi0y+i,yiSOS1 I.e., we have n=|I| SOS1 sets with two members each. The SOS1 sets ensures that only one of y+i,yi can be nonzero. This construct does the same thing as a complementarity condition y+iyi=0 We can also simulate indicator constraints directly using SOS1 sets.
  • In [1], I suggested somewhat of a hack. I should not have mentioned that. Let me explain. Instead of checking each xi, we can try to do y=ixi Now we just require: y0. That will force some xi0. Note that the reverse is not true: if  y=0, we not necessarily have that all xi=0. E.g. consider: x1=1x2=+1xi=0i3 This solution would be cut off as y=ixi=0 while it obeys our original requirement of at least one xi0. Clearly, no free lunch here. The only reason to use this formulation is when using discrete variables for each xi is prohibitively expensive.


Numerical experiment


Let's try out these 4 formulations with the following objective: minz=ix2i Hopefully, just one xi will be nonzero.

Model 1 with indicator constraints does the right thing:


----     90 **** model 1: indicator constraints ****
            PARAMETER epsilon              =        0.001  larger than feasibility tolerance

----     90 VARIABLE x.L  free variables

i3 0.001


----     90 VARIABLE delta.L  binary indicator variable (indicates region)

        region2     region3

i1        1.000
i2        1.000
i3                    1.000
i4        1.000
i5        1.000
i6        1.000
i7        1.000
i8        1.000
i9        1.000
i10       1.000


----     90 VARIABLE z.L                   =  1.000000E-6  objective variable

The main issue with the implementation is that GAMS does not support indicator constraints very well. It is a bit of a half-baked functionality. Firstly, a solver option file should not change the meaning of a model. Secondly, the option file reader is too primitive to handle practical cases without using some hacks. If you take integer programming seriously, you really must take indicator constraints seriously. This will be illustrated below.

The second model with binary variables is interesting. We get incorrect results depending on the exact value of ϵ. We try the model with ϵ=0.001,0.01,0.1, and fixed bounds 1000xi1000. Only for ϵ=0.1 do we get correct results! I did not expect this: I thought my values for the bounds and for ϵ were sufficiently conservative.


----    120 **** model 2: binary variables ****
            PARAMETER epsilon              =        0.001  larger than feasibility tolerance

----    120 VARIABLE x.L  free variables

                      ( ALL       0.000 )


----    120 VARIABLE delta.L  binary indicator variable (indicates region)

        region1     region2     region3

i1  9.999990E-7                   1.000
i2        1.000 9.999990E-7
i3  9.999990E-7                   1.000
i4  9.999990E-7                   1.000
i5  9.999990E-7                   1.000
i6  9.999990E-7                   1.000
i7  9.999990E-7                   1.000
i8  9.999990E-7                   1.000
i9  9.999990E-7                   1.000
i10 9.999990E-7                   1.000


----    120 VARIABLE z.L                   =        0.000  objective variable


======================================

----    124 **** model 2: binary variables ****
            PARAMETER epsilon              =        0.010  larger than feasibility tolerance

----    124 VARIABLE x.L  free variables

                      ( ALL       0.000 )


----    124 VARIABLE delta.L  binary indicator variable (indicates region)

        region1     region3

i1  9.999900E-6       1.000
i2  9.999900E-6       1.000
i3  9.999900E-6       1.000
i4  9.999900E-6       1.000
i5  9.999900E-6       1.000
i6  9.999900E-6       1.000
i7  9.999900E-6       1.000
i8  9.999900E-6       1.000
i9  9.999900E-6       1.000
i10 9.999900E-6       1.000


----    124 VARIABLE z.L                   =        0.000  objective variable

======================================


----    128 **** model 2: binary variables ****
            PARAMETER epsilon              =        0.100  larger than feasibility tolerance

----    128 VARIABLE x.L  free variables

i10 0.100


----    128 VARIABLE delta.L  binary indicator variable (indicates region)

        region2     region3

i1        1.000
i2        1.000
i3        1.000
i4        1.000
i5        1.000
i6        1.000
i7        1.000
i8        1.000
i9        1.000
i10                   1.000


----    128 VARIABLE z.L                   =        0.010  objective variable

The bounds are quite reasonable. But nevertheless, we observe numerical issues. Very small values in the binary variable δi,r cause feasible solutions that actually don't make sense. We can play with the (feasibility) tolerances or with the bounds. Here, I just adjusted the size of ϵ. We get proper solutions only after increasing it to 0.1 (!!!). 

This is a somewhat extreme example of how Big-M constraints with binary variables can cause serious problems. However, indicator constraints can really help. They are not only a tool to help modeling but also can improve numerics. 

The SOS1 model is also working correctly:

----    157 **** model 3: SOS1 sets ****
            PARAMETER epsilon              =        0.001  larger than feasibility tolerance

----    157 VARIABLE x.L  free variables

i10 0.001


----    157 VARIABLE y.L  variable splitting

           plus

i10       0.001


----    157 VARIABLE gamma.L  gamma(i)=1 ==> |x(i)|>=epsilon

i10 1.000


----    157 VARIABLE z.L                   =  1.000000E-6  objective variable


Finally, our flawed method 4 also gives a somewhat strange solution:

----    177 **** model 4: single SOS1 set ****
            PARAMETER epsilon              =        0.001  larger than feasibility tolerance

----    177 VARIABLE x.L  free variables

i1  1.000192E-4,    i2  1.000192E-4,    i3  1.000192E-4,    i4  1.000192E-4,    i5  1.000192E-4,    i6  1.000192E-4
i7  1.000192E-4,    i8  1.000192E-4,    i9  1.000192E-4,    i10 1.000192E-4


----    177 VARIABLE y2.L  

plus 0.001


----    177 VARIABLE z.L                   =  1.000384E-7  objective variable

Here the deviation 0.001 is distributed over the xi. That makes sense, as it makes the objective smaller.

Conclusion


  • Indicator constraints have two important advantages:
    • they allow for more natural, intuitive formulations
    • and they can be numerically more reliable than big-M constraints using binary variables.
  • GAMS does not do indicator variables very well.
  • This example shows how sensitive big-M constraints can be. It should scare the xxxx out of you.
  • If your solver does not support indicator variables but knows about SOS1 sets, formulations using SOS1 sets can help. However, check the performance: binary variables can be faster.

References


Appendix: GAMS model


option miqcp = cplex;

 

*------------------------------------------------

* x(i) are free variables

*------------------------------------------------

 

 

set

  i /i1*i10/

  'regions' /

   region1  'x(i) <= -epsilon'

   region2  '-epsilon <= x(i) <= epsilon'

   region3  'x(i) >= epsilon'

   /

;

 

scalar epsilon 'larger than feasibility tolerance' /0.001/;

 

variables

   z     'objective variable'

   x(i)  'free variables'

;

 

equation obj;

obj.. z =e= sum(i,sqr(x(i)));

 

*------------------------------------------------

* 1. indicator constraints

* The approach in GAMS is hardly fit for human

consumption.

*------------------------------------------------

  

binary variable

  delta(i,r'binary indicator variable (indicates region)'

;

 

equations

   indic1(i)  "delta(i,'region1')=1 ==> x(i) <= -epsilon"

   indic2a(i"delta(i,'region2')=1 ==> x(i) >= -epsilon"

   indic2b(i"delta(i,'region2')=1 ==> x(i) <= epsilon"

   indic3(i)  "delta(i,'region3')=1 ==> x(i) >= epsilon"

   sumdelta(i"one region is active"

   atleastone  "at least one nonzero region is active"

;

indic1(i)..  x(i) =l= -epsilon; 

indic2a(i).. x(i) =g= -epsilon;

indic2b(i).. x(i) =l= epsilon;

indic3(i)..  x(i) =g= epsilon;

sumdelta(i).. sum(r,delta(i,r)) =e= 1;

atleastone.. sum(i,delta(i,'region1')+delta(i,'region3')) =g= 1;

the last constraint can also be written as:

atleastone.. sum(i,delta(i,'region2')) =l= card(i)-1;

 

 

this does not work

$onecho > cplex.opt

indic indic1(i)$delta(i,'region1')  1

indic indic2a(i)$delta(i,'region2') 1

indic indic2b(i)$delta(i,'region2') 1

indic indic3(i)$delta(i,'region3')  1

$offecho

 

ugly workaround

file opt /cplex.opt/;

put opt;

loop(i,

  put 'indic ' indic1.tn(i'$' delta.tn(i,"region1"' 1'/;

  put 'indic ' indic2a.tn(i'$' delta.tn(i,"region2"' 1'/;

  put 'indic ' indic2b.tn(i'$' delta.tn(i,"region2"' 1'/;

  put 'indic ' indic3.tn(i'$' delta.tn(i,"region3"' 1'/;

);

putclose;

 

 

model m1 /all/;

m1.optfile=1;

solve m1 minimizing z using miqcp;

display "**** model 1: indicator constraints ****",epsilon,x.l,delta.l,z.l;

 

 

*------------------------------------------------

* 2. Binary variables

* Epsilon is too small.

*------------------------------------------------

 

Scalars

   LO /-1000/

   UP /1000/

;

 

x.lo(i) = LO;

x.up(i) = UP;

 

equations

  bound1(i)

  bound2a(i)

  bound2b(i)

  bound3(i)

;

 

bound1(i)..  x(i) =l= -epsilon*delta(i,'region1') + x.up(i)*(1-delta(i,'region1'));

bound2a(i).. x(i) =g= -epsilon*delta(i,'region2') + x.lo(i)*(1-delta(i,'region2'));

bound2b(i).. x(i) =l= epsilon*delta(i,'region2') + x.up(i)*(1-delta(i,'region2'));

bound3(i)..  x(i) =g= epsilon*delta(i,'region3') + x.lo(i)*(1-delta(i,'region3'));

 

model m2 /obj,bound1,bound2a,bound2b,bound3,sumdelta,atleastone/;

solve m2 minimizing z using miqcp;

display "**** model 2: binary variables ****",epsilon,x.l,delta.l,z.l;

 

epsilon = 0.01;

solve m2 minimizing z using miqcp;

display "**** model 2: binary variables ****",epsilon,x.l,delta.l,z.l;

 

epsilon = 0.1;

solve m2 minimizing z using miqcp;

display "**** model 2: binary variables ****",epsilon,x.l,delta.l,z.l;

 

reset to its original value

epsilon = 0.001;

 

*------------------------------------------------

* 3. SOS1 variables

*------------------------------------------------

 

set

  pm 'for variable splitting' /plus,min/

;

 

binary variable gamma(i'gamma(i)=1 ==> |x(i)|>=epsilon';

sos1 variable y(i,pm'variable splitting';

automatically y>=0

 

equations

   def_y(i)

   abs_y(i)

   at_least_one2

;

  

def_y(i).. y(i,'plus')-y(i,'min') =e= x(i);

abs_y(i).. y(i,'plus')+y(i,'min') =g= epsilon*gamma(i);

at_least_one2.. sum(igamma(i)) =g= 1;

 

model m3 /obj,def_y,abs_y,at_least_one2/;

solve m3 minimizing z using miqcp;

display "**** model 3: SOS1 sets ****",epsilon,x.l,y.l,gamma.l,z.l;

 

*------------------------------------------------

* 4. One SOS1 Set

* This is a hack: it cuts off solutions.

*------------------------------------------------

 

 

sos1 variable y2(pm);

automatically y2>=0

 

equations

  def_y2

  abs_y2

;

def_y2.. y2('plus')-y2('min') =e= sum(i,x(i));

abs_y2.. y2('plus')+y2('min') =g= epsilon;

 

model m4 /obj,def_y2,abs_y2/;

solve m4 minimizing z using miqcp;

display "**** model 4: single SOS1 set ****",epsilon,x.l,y2.l,z.l;

 

 

1 comment:

  1. Your second model can be strengthened by replacing the four big-M constraints with L[i] * delta[i,1] - epsilon * delta[i,2] + epsilon * delta[i,3] <= x[i] <= -epsilon * delta[i,1] + epsilon * delta[i,2] + U[i] * delta[i,3]

    ReplyDelete