Loading [MathJax]/jax/output/CommonHTML/jax.js

Friday, February 22, 2019

Piecewise linear functions and formulations for interpolation (part 2)

Pyomo supports a number of methods to perform piecewise linear interpolation. In part 1 [1], the following methods for piecewise linear interpolation were discussed:

  1. SOS2: use SOS2 variables to model the piecewise linear functions. This is an easy modeling exercise.
  2. BIGM_BIN: use binary variables and big-M constraints to enable and disable constraints. This is a more complex undertaking, especially if we want to use the smallest possible values for the big-M constants. Pyomo has bug in this version.
  3. BIGM_SOS1: a slight variation of the BIGM_BIN model. Pyomo has the same problem with this model.
Here I show a few more formulations:

  1. DCC: Disaggregated Convex Combination formulation is a simulation of the SOS2 model by binary variables. 
  2. CC:  Convex Combination formulation is a similar SOS2 like approach using binary variables only.
  3. MC: Multiple Choice model uses a semi-continuous approach.
  4. INCR: Incremental formulation is using all previous segments.
I'll state the formulation in mathematical notation, hopefully a little bit more accessible than in some papers I looked at, and illustrate the formulation with GAMS code to make it more concrete.

Note that part 3 [2] will discuss some newer formulations: LOG and DLOG that only need a logarithmic number of binary variables.

For demonstration purposes we use the same small example with 4 breakpoints and 3 segments.



Method 4: DCC - Disaggregated Convex Combination Model


This is an impressive name for a model that is actually not very complicated. 

First we define a binary variable for each segment: δs={1if segment s is selected0otherwise Then we perform interpolation on the selected segment. For this we define a weight variable: λs,k0 for the two end points belonging to segment s.  This way we achieve the "only two neighboring weights λ can be nonzero" constraint from our SOS2 model. The mathematical model can look like:


DCC - Disaggregated Convex Combination Formulation
x=s,k|SK(s,k)ˉxkλs,ky=s,k|SK(s,k)ˉykλs,kδs=k|SK(s,k)λs,ksδs=1λs,k0δs{0,1}

Here the mapping set SK(s,k) is TRUE for the two breakpoints k belonging to segment s.

We can also write the model more directly, without the mapping SK(s,k):


DCC - Disaggregated Convex Combination Formulation 2
x=s(ˉxsλs,s+ˉxs+1λs,s+1)y=s(ˉysλs,s+ˉys+1λs,s+1)δs=λs,s+λs,s+1sδs=1λs,k0δs{0,1}



Here is the GAMS version:

set
  k
'breakpoints' /point1*point4/
  s
'segments' /segment1*segment3/
  sk(s,k)
'mapping'
;
sk(s,k) =
ord(s)=ord(k) or ord(s)=ord(k)-1;
display sk;

table data(k,*)
          
x   y
 
point1   1   6
 
point2   3   2
 
point3   6   8
 
point4  10   7
;


positive variable lambda(s,k) 'interpolation';
binary variable delta(s) 'select segment';
variable x,y;

equations
   xdef      
'x'
   ydef      
'y'
   link(s)   
'link delta-lambda'
   sumdelta  
'select segment'
;

xdef.. x =e=
sum(sk(s,k), lambda(s,k)*data(k,'x'));
ydef.. y =e=
sum(sk(s,k), lambda(s,k)*data(k,'y'));
link(s).. delta(s) =e=
sum(sk(s,k),lambda(s,k));
sumdelta..
sum(s, delta(s)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,delta.l,lambda.l;


In this model I made a strong distinction between segments s and breakpoints k. This will help the GAMS model to perform domain checking (type checking), so we get a bit better protection against errors in the model. Of course, if you prefer you just can indicate segments by 1,,K1.

Note that the linking constraints link, perform two functions. First: they make sure that for unselected segments (with δs=0), we have λs,k=0.  Second, for the selected segment, we automatically sum the λ's to 1.

The output of the model looks like:


----     13 SET sk  mapping

              point1      point2      point3      point4

segment1         YES         YES
segment2                     YES         YES
segment3                                 YES         YES

----     45 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     45 VARIABLE delta.L  

segment2 1.000


----     45 VARIABLE lambda.L  

              point2      point3

segment2       0.333       0.667

The display of the set SK confirms we setup the topography correctly: segment k uses points k,k+1.

We fixed x=5 which yields y=6. We see δ2=1 so the second segment is selected, and this is also visible from the variables λs,k indicating we interpolate between points 2 and 3.

We can have a look at the generated equations:


---- xdef  =E=  

xdef..  - lambda(segment1,point1) - 3*lambda(segment1,point2) - 3*lambda(segment2,point2) - 6*lambda(segment2,point3)
     
      - 6*lambda(segment3,point3) - 10*lambda(segment3,point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)
     

---- ydef  =E=  

ydef..  - 6*lambda(segment1,point1) - 2*lambda(segment1,point2) - 2*lambda(segment2,point2) - 8*lambda(segment2,point3)
     
      - 8*lambda(segment3,point3) - 7*lambda(segment3,point4) + y =E= 0 ; (LHS = 0)
     

---- link  =E=  

link(segment1)..  - lambda(segment1,point1) - lambda(segment1,point2) + delta(segment1) =E= 0 ; (LHS = 0)
     
link(segment2)..  - lambda(segment2,point2) - lambda(segment2,point3) + delta(segment2) =E= 0 ; (LHS = 0)
     
link(segment3)..  - lambda(segment3,point3) - lambda(segment3,point4) + delta(segment3) =E= 0 ; (LHS = 0)
     

---- sumdelta  =E=  

sumdelta..  delta(segment1) + delta(segment2) + delta(segment3) =E= 1 ; (LHS = 0, INFES = 1 ****)

In a sense we simulated the SOS2 model using binary variables. More visible here is the two-step approach:

  1. Select the segment s using the binary variables δs.
  2. Interpolate between the breakpoints of this segment using the positive variables λs,k.
Of course in a MIP model we have simultaneous equations, so these things happen actually at the same time. The 2-step paradigm is more of a useful mental model.

For completeness, here is the GAMS model without set SK(s,k):

* DCC2: Disaggregated Convex Combination formulation 2
* without mapping SK(s,k)

set
  k
'breakpoints' /point1*point4/
  s(k)
'segments' /point1*point3/
;

table data(k,*)
          
x   y
 
point1   1   6
 
point2   3   2
 
point3   6   8
 
point4  10   7
;


positive variable lambda(s,k) 'interpolation';
binary variable delta(s) 'select segment';
variable x,y;

equations
   xdef      
'x'
   ydef      
'y'
   link(s)   
'link delta-lambda'
   sumdelta  
'select segment'
;

xdef.. x =e=
sum(s(k), lambda(s,k)*data(k,'x')+lambda(s,k+1)*data(k+1,'x'));
ydef.. y =e=
sum(s(k), lambda(s,k)*data(k,'y')+lambda(s,k+1)*data(k+1,'y'));
link(s(k)).. delta(s) =e= lambda(s,k)+lambda(s,k+1);
sumdelta..
sum(s, delta(s)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,delta.l,lambda.l;


Method 5: CC - Convex Combination Model


This formulation is very similar to the previous one. The main difference is the structure of the λ variables. We index them by k only. This is more like the SOS2 model.  The manner in which we link λk to δs becomes somewhat different. Instead of an equality we use an inequality λks|SK(s,k)δs This make sure that only for a selected segment with δs=1, the corresponding λk's can be nonzero. We need to add explicitly that kλk=1 as this is no longer implied by the linking constraints. The model looks like:


CC - Convex Combination Formulation
x=kˉxkλky=kˉykλkkλk=1λks|SK(s,k)δssδs=1λk0δs{0,1}


This model has fewer continuous variables than the DCC model. The GAMS version looks like:

set
  k
'breakpoints' /point1*point4/
  s
'segments' /segment1*segment3/
  sk(s,k)
'mapping'
;
sk(s,k) =
ord(s)=ord(k) or ord(s)=ord(k)-1;
display sk;

table data(k,*)
          
x   y
 
point1   1   6
 
point2   3   2
 
point3   6   8
 
point4  10   7
;


positive variable lambda(k) 'interpolation';
binary variable delta(s) 'select segment';
variable x,y;

equations
   xdef      
'x'
   ydef      
'y'
   link(k)   
'link delta-lambda'
   sumlambda 
'interpolation'
   sumdelta  
'select segment'
;

xdef.. x =e=
sum(k, lambda(k)*data(k,'x'));
ydef.. y =e=
sum(k, lambda(k)*data(k,'y'));
link(k).. lambda(k) =l=
sum(sk(s,k),delta(s));
sumlambda..
sum(k, lambda(k)) =e= 1;
sumdelta..
sum(s, delta(s)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,delta.l,lambda.l;

The output (including the generated equations) is:


Generated equations

---- xdef  =E=  x

xdef..  - lambda(point1) - 3*lambda(point2) - 6*lambda(point3) - 10*lambda(point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)
     

---- ydef  =E=  y

ydef..  - 6*lambda(point1) - 2*lambda(point2) - 8*lambda(point3) - 7*lambda(point4) + y =E= 0 ; (LHS = 0)
     

---- link  =L=  link delta-lambda

link(point1)..  lambda(point1) - delta(segment1) =L= 0 ; (LHS = 0)
     
link(point2)..  lambda(point2) - delta(segment1) - delta(segment2) =L= 0 ; (LHS = 0)
     
link(point3)..  lambda(point3) - delta(segment2) - delta(segment3) =L= 0 ; (LHS = 0)
     
link(point4)..  lambda(point4) - delta(segment3) =L= 0 ; (LHS = 0)
     
Solution
---- sumlambda  =E=  interpolation

sumlambda..  lambda(point1) + lambda(point2) + lambda(point3) + lambda(point4) =E= 1 ; (LHS = 0, INFES = 1 ****)
     

---- sumdelta  =E=  select segment

sumdelta..  delta(segment1) + delta(segment2) + delta(segment3) =E= 1 ; (LHS = 0, INFES = 1 ****)


----     49 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     49 VARIABLE delta.L  select segment

segment2 1.000


----     49 VARIABLE lambda.L  interpolation

point2 0.333,    point3 0.667



The methods DCC and CC simulate our SOS2 model from [1] using binary variables. This means the model can handle step functions (we don't need to form a slope). The models are not too difficult to setup, as is illustrated by the GAMS models implementing them.

An early reference to this method is [4].

Method 6: MC - Multiple Choice formulation


This is a well-known method. We assume we can calculate slopes and intercepts for each segment. I.e. we have y=as+bsx if ˉxsxˉxs+1  with as=ˉys+1ˉysˉxs+1ˉxsbs=ˉysasˉxs

For the model we introduce binary variables δs to indicate which segment is selected, and so-called semi-continuous variables vs0[ˉxs,ˉxs+1]. The variable vs can be modeled as ˉxsδsvsˉxs+1δs

With this we can formulate the complete model:


MC - Multiple Choice Formulation
x=svsy=s(asvs+bsδs)ˉxsδsvsˉxs+1δssδs=1δs{0,1}as=ˉys+1ˉysˉxs+1ˉxsbs=ˉysasˉxs


Notes:

  • The sandwich equation  ˉxsδsvsˉxs+1δs must likely be implemented as two inequalities.
  • This construct says: vs=0 or vs[ˉxs,ˉxs+1]. vs is sometimes called semi-continuous.
  • The variables δs and vs are connected: δs=0vs=0δs=1vs=x I.e. they operate in parallel.
  • I have seen cases where this model outperformed the SOS2 formulation.

The GAMS model is simple:

set
  k
'breakpoints' /k1*k4/
  s(k)
'segments' /k1*k3/
;

table data(k,*)
      
x   y
 
k1   1   6
 
k2   3   2
 
k3   6   8
 
k4  10   7
;

data(s(k),
'dx') = data(k+1,'x')-data(k,'x');
data(s(k),
'dy') = data(k+1,'y')-data(k,'y');
data(s(k),
'slope') = data(k,'dy')/data(k,'dx');
data(s(k),
'intercept') = data(k,'y')-data(k,'slope')*data(k,'x');
display data;

variable v(s) 'equal to x or 0';
binary variable delta(s) 'select segment';
variable x,y;

equations
   xdef      
'x'
   ydef      
'y'
   semicont1(k)
   semicont2(k)
   sumdelta  
'select segment'
;

xdef.. x =e=
sum(s, v(s));
ydef.. y =e=
sum(s, data(s,'slope')*v(s)+data(s,'intercept')*delta(s));
semicont1(s(k)).. v(s) =l= data(k+1,
'x')*delta(s);
semicont2(s)..    v(s) =g= data(s,
'y')*delta(s);
sumdelta..
sum(s, delta(s)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display data,x.l,y.l,delta.l,v.l;




The results look like:


----     43 PARAMETER data  

             x           y          dx          dy       slope   intercept

k1       1.000       6.000       2.000      -4.000      -2.000       8.000
k2       3.000       2.000       3.000       6.000       2.000      -4.000
k3       6.000       8.000       4.000      -1.000      -0.250       9.500
k4      10.000       7.000


----     43 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     43 VARIABLE delta.L  select segment

k2 1.000


----     43 VARIABLE v.L  equal to x or 0

k2 5.000


Segment 2 is selected. We can see that directly from the binary variable δs, but also from the semi-continuous variable vs.

Method 7: INC - Incremental Formulation


In the incremental or delta method we add up all contributions of "earlier" segments, to find our x and y.

Let s be the segment that contains our current x. We define a binary variable δs as δs={1for s<s0for ss In addition we use a continuous variable λs[0,1] indicating how much of each segment we "use up". The contribution for earlier segments is 1 and for the current segment we have a fractional value. So we have: {λs=1for s<sλs[0,1]for s=sλs=0for s>s With these definitions we can write: x=ˉx1+sλs(ˉxs+1ˉxs)y=ˉy1+sλs(ˉys+1ˉys)

Now we need to formulate a structure that enforce these rules on δ and λ. The following will do that: λs+1δsλs Typically you will need to implement this as two separate constraints.

The complete model looks like:


INC - Incremental Formulation
x=ˉx1+sλs(ˉxs+1ˉxs)y=ˉy1+sλs(ˉys+1ˉys)λs+1δsλsδs{0,1}λs[0,1]

We can see that we can fix the last δs=0. The model will behave correctly without this, but we may help the presolver a bit with this.


The GAMS model can look like:

set
  k
'breakpoints' /k1*k4/
  s(k)
'segments' /k1*k3/
;

table data(k,*)
      
x   y
 
k1   1   6
 
k2   3   2
 
k3   6   8
 
k4  10   7
;

positive variable lambda(s) 'contribution factor of segment';
lambda.up(s) = 1;
binary variable delta(s) 'previously contributing segments';
variable x,y;

equations
   xdef      
'x'
   ydef      
'y'
   link1(s)  
'links lambda delta'
   link2(s)  
'links lambda delta'
;

* fix last delta(s) to 0.
* this is not strictly needed
delta.fx(s)$(
ord(s)=card(s)) = 0;

xdef.. x =e= data(
'k1','x')+ sum(s(k), lambda(s)*(data(k+1,'x')-data(k,'x')));
ydef.. y =e= data(
'k1','y')+ sum(s(k), lambda(s)*(data(k+1,'y')-data(k,'y')));
link1(s)..  lambda(s+1) =l= delta(s);
link2(s)..  delta(s) =l= lambda(s);

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,delta.l,lambda.l;



Note that in equation link1, we go one position too far when addressing λs+1 for the last s. GAMS will make that reference zero, and that is correct for this case. We see in the equation listing:


---- link1  =L=  links lambda delta

link1(k1)..  lambda(k2) - delta(k1) =L= 0 ; (LHS = 0)
     
link1(k2)..  lambda(k3) - delta(k2) =L= 0 ; (LHS = 0)
     
link1(k3)..  - delta(k3) =L= 0 ; (LHS = 0)

The solution is


----     36 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     36 VARIABLE delta.L  previously contributing segments

k1 1.000


----     36 VARIABLE lambda.L  contribution factor of segment

k1 1.000,    k2 0.667


Notes:
  • A slightly different formulation is the following. Use the definition: δs={1for ss0for s>s Now we use the sandwich equation: δs+1λsδs With this formulation we can fix: δ1=1.
  • This approach goes back to [5].


References


  1. Piecewise linear functions and formulations for interpolation (part 1). This post discusses formulations SOS2, BIGM_BIN and BIGM_SOS1. The SOS2 formulation is the most important. Also illustrated how a Pyomo implementation is not implementing BIGM_BIN and BIGM_SOS1 correctly.  http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and.html
  2. Piecewise linear functions and formulations for interpolation (part 3). This post describes the LOG and DLOG formulations, which have a logarithmic number of binary variables. We see also some issues with the Pyomo implementation. http://yetanothermathprogrammingconsultant.blogspot.com/2019/03/piecewise-linear-functions-and.html
  3. Keely L. Croxton, Bernard Gendron, Thomas L. Magnanti, A Comparison of Mixed-Integer Programming Models for Non-Convex Piecewise Linear Cost Minimization Problems, Management Science, Volume 49, Issue 9, 2003, pages 1121-1273
  4. G. Dantzig, On the significance of solving linear programming problems with some integer variables, Econometrica 28, 30-44, 1960
  5. H. Markowitz, A. Manne, On the solution of discrete programming problems, Econometrica 25, 84-110, 1957

Sunday, February 17, 2019

Piecewise linear functions and formulations for interpolation (part 1)

Pyomo has a tool to use different formulations for piecewise linear functions. From the help:

>>> from pyomo.core import *
>>> help(Piecewise)
Help on class Piecewise in module pyomo.core.base.piecewise:

class Piecewise(pyomo.core.base.block.Block)
 |  Piecewise(*args, **kwds)
 |
 |      Adds piecewise constraints to a Pyomo model for functions of the
 |      form, y = f(x).
 |
 |      Usage:
 |              model.const = Piecewise(index_1,...,index_n,yvar,xvar,**Keywords)
 |              model.const = Piecewise(yvar,xvar,**Keywords)
 |
 |      Keywords:
 |
 |  -pw_pts={},[],()
 |            A dictionary of lists (keys are index set) or a single list
 |            (for the non-indexed case or when an identical set of
 |            breakpoints is used across all indices) defining the set of
 |            domain breakpoints for the piecewise linear
 |            function. **ALWAYS REQUIRED**
 |
 |  -pw_repn=''
 |            Indicates the type of piecewise representation to use. This
 |            can have a major impact on solver performance.
 |            Choices: (Default 'SOS2')
 |
 |               ~ + 'SOS2'      - Standard representation using sos2 constraints
 |               ~   'BIGM_BIN'  - BigM constraints with binary variables.
 |                                 Theoretically tightest M values are automatically
 |                                 determined.
 |               ~   'BIGM_SOS1' - BigM constraints with sos1 variables.
 |                                 Theoretically tightest M values are automatically
 |                                 determined.
 |               ~*+ 'DCC'       - Disaggregated convex combination model
 |               ~*+ 'DLOG'      - Logarithmic disaggregated convex combination model
 |               ~*+ 'CC'        - Convex combination model
 |               ~*+ 'LOG'       - Logarithmic branching convex combination
 |               ~*  'MC'        - Multiple choice model
 |               ~*+ 'INC'       - Incremental (delta) method
 |
 |             + Supports step functions
 |             * Source: "Mixed-Integer Models for Non-separable Piecewise Linear
 |                        Optimization: Unifying framework and Extensions" (Vielma,
 |                        Nemhauser 2008)
 |             ~ Refer to the optional 'force_pw' keyword.
 |
 |  -pw_constr_type=''
 |            Indicates the bound type of the piecewise function.
 |            Choices:
-- More  --

Some of these representations are not immediately obvious. Let's try to get a intuitive understanding of them. For a more formal discussion see [3].

I just consider the case I am using most often: a one-dimensional piecewise function y=f(x). Note that in the context of an optimization model we need not necessarily to interpret this as: given an x, calculate an y,  but rather as part of a system of equations. I.e. x and y are determined at same time (if y is known, we find x, or even: we find x,y simultaneously).

The piecewise linear function is defined by its breakpoints:


Some observations:

  1. We have K breakpoints (ˉxk,ˉyk) and K1 segments,
  2. we assume ˉxk is ordered: ˉxk+1ˉxk,  
  3. we assume we have lower and upper bounds on x and y,
  4. we assume the curve is connected (i.e. no forbidden regions for x, these have to be handled separately,
  5. we allow step functions by adding a vertical segment (not all formulation allow this).

Finally, we note that interpolating between two points (ˉx1,ˉy1), (ˉx2,ˉy2) can be stated as choosing two weights λ1,λ2 with: x=λ1ˉx1+λ2ˉx2y=λ1ˉy1+λ2ˉy2λ1+λ2=1λ1,λ20



Method 1: SOS2 formulation


This one is the easiest to remember. Many more advanced MIP solvers have facilities for SOS2 variables [4]. The definition is: 

Let x1,,xn be members of a Special Ordered Set of Type 2 (SOS2), then only two variables xi can be nonzero and they have to be neighbors. All other variables are zero.
This looks like a strange animal, but in fact this construct is especially designed for cases like our piecewise linear interpolation scheme. Using SOS2 constraints, we can easily formulate a model: 


SOS2 Formulation
x=kˉxkλky=kˉykλkkλk=1λk0SOS2(λ1,,λK)

Note that x and y are decision variables, while ˉxk and ˉyk are constants. The color coding helps to distinguish variables from data.

In the solution, two adjacent λs,λs+1 will be between 0 and 1, and for these we have λs+λs+1=1. All the others will be zero. I.e. we will be interpolating between the two corresponding breakpoints (ˉxs,ˉys) and (ˉxs+1,ˉys+1).  This SOS2 formulation does two things in one swoop: it selects the segment s the variables x and y belong to, and it interpolates between the two selected breakpoints.

Notes:

  • This method is used quite a lot in practice, not in the least because it makes life easy for the modeler. It is useful to have this formulation in your toolbox.
  • This approach handles step functions without a problem.
  • There are no issues with big-M constants: there are none.
  • However, binary variables may offer performance benefits. 

The Pyomo model is as follows:


#
# expected solution X=5, Y=6
#

xdata = [1., 3., 6., 10.]
ydata  = [6.,2.,8.,7.]

from pyomo.core import *

model = ConcreteModel()

model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))

model.con = Piecewise(model.Y,model.X,
                      pw_pts=xdata,
                      pw_constr_type='EQ',
                      f_rule=ydata,
                      pw_repn='SOS2')

# see what we get for Y when X=5
def con2_rule(model):
    return model.X==5

model.con2 = Constraint(rule=con2_rule)

model.obj = Objective(expr=model.Y, sense=maximize)

It is also instructive to see how this can be modeled in a more traditional modeling language like GAMS:

set
  k
'breakpoints' /point1*point4/
;
table data(k,*)
          
x   y
 
point1   1   6
 
point2   3   2
 
point3   6   8
 
point4  10   7
;

sos2 variable lambda(k) 'sos2 variable';
variable x,y;
equations
   refrow   
'reference row'
   funrow   
'function row'
   convexity
;

refrow.. x =e=
sum(k, lambda(k)*data(k,'x'));
funrow.. y =e=
sum(k, lambda(k)*data(k,'y'));
convexity..
sum(k, lambda(k)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,lambda.l;

Results will look like:


----     29 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     29 VARIABLE lambda.L  sos2 variable

point2 0.333,    point3 0.667

This is the correct solution when x=5.

Method 2: Big-M Binary Formulation


In this formulation we turn on or off each linear equality depending if we select a segment. Looking at our simple example, we can state the problem as: y={2x+8x[1,3]2x4x[3,6]0.25x+9.5x[6,10] We can model this "case statement" with binary variables as follows: define δs={1if segment s is selected0otherwise When δs=0 for a given segment s, we need to make the corresponding equation non-binding, and also the corresponding limits on x. This can be accomplished using inequalities and some big-M's. As an example, consider the second segment. We can write: 2x4(1δ2)My2x4+(1δ2)M3(1δ2)Mx6+(1δ2)M Finding good values for each M is a bit tedious. Basically we have to choose the smallest M such that each breakpoint is reachable (i.e. not cut off). For the constraint y2x4+(1δ2)M we can write: M=maxk{ˉyk2ˉxk+4}=max{621+4,223+4,826+4,7210+4}=max{8,0,0,9}=8 The zero values correspond to breakpoints on the current segment.

A high level description can look like: δs=1{y=asx+bsˉxsxˉxs+1sδs=1δs{0,1} where as,bs are the slope and intercept for the linear function in segment s. These coefficients can be calculated easily from the breakpoints (ˉxs,ˉys) and (ˉxs+1,ˉys+1). This implication can be implemented as follows: 


Big-M Binary Formulation
asx+bs(1δs)Myasx+bs+(1δs)Mˉxs(1δs)Mxˉxs+1+(1δs)Msδs=1δs{0,1}as=ˉys+1ˉysˉxs+1ˉxsbs=ˉysasˉxs


This looks a bit more intimidating, but it is conceptually simple. Unfortunately we see some problems with the Pyomo implementation of this formulation.

Notes:

  • We need 4 inequalities per segment, and one binary variable.
  • This method assumes we can calculate a slope as. For step functions we have an infinite slope when there is a vertical segment. I.e. we cannot do step functions using this method.
  • It is imperative to choose reasonable values for the big-M's.
  • In the model above I just used one M, but actually each of them can have a different value.
  • This is not a formulation I typically use: it is somewhat cumbersome to setup. 
  • Pyomo implements this method incorrectly, and we will get wrong results.

Method 3: Big-M SOS1 formulation


This is just a minor variant on the previous method. Instead of using binary variables, we use continuous variables that are members of a Special Ordered Set of Type 1. SOS1 sets are defined by:

Let x1,,xn be members of a Special Ordered Set of Type 1 (SOS1), then only one variable xi can be nonzero. All other variables are zero.

Basically we replace the binary variables: sδs=1δs{0,1} by sλs=1λs0SOS1(λ1,,λK1) This does not change the model a lot:  


Big-M SOS1 Formulation
asx+bs(1λs)Myasx+bs+(1λs)Mˉxs(1λs)Mxˉxs+1+(1δs)Msλs=1λs0SOS1(λ1,λK1)as=ˉys+1ˉysˉxs+1ˉxsbs=ˉysasˉxs

Notes:

  • This method shows the same Pyomo bug as the big-M binary variable model: it is incorrectly implemented. See the analysis below for what is wrong.
  • If a solver support SOS1 variables, it also will typically support SOS2 variables. The SOS2 approach may be easier to use.
  • This method is supposedly marked for removal in future versions of Pyomo. 
  • We can use the SOS1 variables to prevent the need for big-M's. We can write something like asx+bsλsyasx+bs+λsˉxsλsxˉxs+1+λssδs=1λs0δs{0,1}SOS1(λs,δs) If you look at this carefully, you can see this implements the implications: δs=1{y=asx+bsˉxsxˉxs+1sδs=1δs{0,1} This is similar to how a MIP solver may rewrite implications. This particular way of dealing with an implication is a useful concept to know.
  • I never have used the BIGM_SOS1 formulation in my models. It just does not look very appetizing.

A nasty Pyomo bug


To test the bigM-bin method on our simple data set I formulated the model: maxypiecewise on our datax=5x[1,10] We expect to see as solution: x=5,y=6 and an objective of 6. Instead we see:

D:\Python\Python37\Scripts>pyomo solve --solver=glpk bigm.py
[    0.00] Setting up Pyomo environment
[    0.00] Applying Pyomo preprocessing actions
WARNING: DEPRECATED: The 'BIGM_BIN' and 'BIGM_SOS1' piecewise representations
    will be removed in a future version of Pyomo. They produce incorrect
    results in certain cases
[    0.60] Creating model
[    0.60] Applying solver
[    0.83] Processing results
    Number of solutions: 1
    Solution Information
      Gap: 0.0
      Status: optimal
      Function Value: 8.25
    Solver results file: results.json
[    0.86] Applying Pyomo postprocessing actions
[    0.86] Pyomo Finished

D:\Python\Python37\Scripts>

Well, this is disappointing. We get a wrong solution!

We also get a warning that the Pyomo people are aware of problems with this method.

This is more wrong than what can be explained away with some numerical issues related to large big-M values. Large big-M values can lead to wrong results. I don't think that is the case here, but let's investigate.

The details of the Pyomo error


This section may become a bit dreary: we will dive into the details of this bug. Not for the faint of heart.

The model looks like the Pyomo model shown in the SOS2 section. The only difference is that we used a different method: BIGM_BIN instead of SOS2.

The solution file is:


{
    "Problem": [
        {
            "Lower bound": 8.25,
            "Name": "unknown",
            "Number of constraints": 9,
            "Number of nonzeros": 21,
            "Number of objectives": 1,
            "Number of variables": 6,
            "Sense": "maximize",
            "Upper bound": 8.25
        }
    ],
    "Solution": [
        {
            "number of solutions": 1,
            "number of solutions displayed": 1
        },
        {
            "Constraint": "No values",
            "Gap": 0.0,
            "Message": null,
            "Objective": {
                "obj": {
                    "Value": 8.25
                }
            },
            "Problem": {},
            "Status": "optimal",
            "Variable": {
                "X": {
                    "Value": 5.0
                },
                "Y": {
                    "Value": 8.25
                },
                "con.bin_y[3]": {
                    "Value": 1.0
                }
            }
        }
    ],
    "Solver": [
        {
            "Error rc": 0,
            "Statistics": {
                "Branch and bound": {
                    "Number of bounded subproblems": "1",
                    "Number of created subproblems": "1"
                }
            },
            "Status": "ok",
            "Termination condition": "optimal",
            "Time": 0.12992453575134277
        }
    ]
}

We see two problems:

  • Clearly the Y value is wrong: it should be 6 instead of 8.25. 
  • We see X=5 and con.bin_y[3]=1. The last value indicates we are in the third segment. But the third segment is [6,10] and does not include x=5.

To debug this, we can generate and inspect the LP file. This is a bit hard to read:


\* Source Pyomo model name=unknown *\

max 
obj:
+1 Y

s.t.

c_e_con2_:
+1 X
= 5

c_u_con_BIGM_constraint1(1)_:
+2 X
+1 Y
+19 con_bin_y(1)
<= 27

c_u_con_BIGM_constraint1(2)_:
-2 X
+1 Y
+8 con_bin_y(2)
<= 4

c_u_con_BIGM_constraint1(3)_:
+0.25 X
+1 Y
<= 9.5

c_e_con_BIGM_constraint2_:
+1 con_bin_y(1)
+1 con_bin_y(2)
+1 con_bin_y(3)
= 1

c_l_con_BIGM_constraint3(1)_:
+2 X
+1 Y
>= 8

c_u_con_BIGM_constraint3(2)_:
+2 X
-1 Y
+9 con_bin_y(2)
<= 13

c_u_con_BIGM_constraint3(3)_:
-0.25 X
-1 Y
+6.75 con_bin_y(3)
<= -2.75

c_e_ONE_VAR_CONSTANT: 
ONE_VAR_CONSTANT = 1.0

bounds
   1 <= X <= 10
    -inf <= Y <= +inf
   0 <= con_bin_y(1) <= 1
   0 <= con_bin_y(2) <= 1
   0 <= con_bin_y(3) <= 1
binary
  con_bin_y(1)
  con_bin_y(2)
  con_bin_y(3)
end


Let's compare carefully what we expect and what we see:

ConstraintExpectedGenerated by PyomoNotes
Objectivemaxy
max obj: +1 Y
Segment 1y2x+8+(1δ1)M
+2X+1Y+19con_bin_y(1)<=27
M=19
y2x+8(1δ1)M
+2X+1Y>=8
M=0
x1(1δ1)M This one is not needed.
x3+(1δ1)M M=7
Segment 2y2x4+(1δ2)M
-2X+1Y+8con_bin_y(2)<=4
M=8
y2x4(1δ2)M
+2X-1Y+9con_bin_y(2)<=13
M=9
x3(1δ2)M M=2
x6+(1δ2)M M=4
Segment 3y0.25x+9.5+(1δ3)M
+0.25X+1Y<=9.5
M=0
y0.25x+9.5(1δ3)M
0.25X-1Y+6.75con_bin_y(3)<=-2.75
M=6.75
x6(1δ3)M M=5
x10+(1δ3)M This one is not needed.
Sumsδs=1
+1con_bin_y(1)
+1con_bin_y(2)
+1con_bin_y(3)=1
Fixx=5
+1X=5

We see we have missing inequalities relating x to the segments. The cells marked in yellow indicate the missing constraints. No wonder we got the wrong results. The omission to include the condition δs=1ˉxsxˉxs+1 explains our erroneous solution: we are looking at the wrong segment.

We cannot blame the big-M's for this problem. The Pyomo piecewise function is just forgetting to include a number of inequalities.

This was an interesting exercise: debugging without looking at the code.

References


  1. Piecewise linear functions and formulations for interpolation (part 2). This post discusses formulations DCC, CC, MC and INCR. http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and_22.html
  2. Piecewise linear functions and formulations for interpolation (part 3). This post describes the LOG and DLOG formulations, which have a logarithmic number of binary variables. We see also some issues with the Pyomo implementation. http://yetanothermathprogrammingconsultant.blogspot.com/2019/03/piecewise-linear-functions-and.html
  3. Juan Pablo Vielma, Shabbir Ahmed and George Nemhauser, Mixed-Integer Models for Nonseparable Piecewise Linear Optimization: Unifying Framework and Extensions, Operations Research 58(2):303-315, 2010
  4. E.M.L. Beale, J.J.H. Forrest, Global Optimization Using Special Ordered Sets, Math Prog 10, 52-69, 1976


Appendix: script to calculate big-M's in BIGM_INT and BIGM_SOS1 models


It is easy to make mistakes in the calculate of the big M values for the BIGM_INT and BIGM_SOS1 models. I used the following model to calculate them:

set
  k
'breakpoints' /k1*k4/
  s(k)
'segments' /k1*k3/
;

table data(k,*)
      
x   y
 
k1   1   6
 
k2   3   2
 
k3   6   8
 
k4  10   7
;

data(s(k),
'dx') = data(k+1,'x')-data(k,'x');
data(s(k),
'dy') = data(k+1,'y')-data(k,'y');
data(s(k),
'slope') = data(k,'dy')/data(k,'dx');
data(s(k),
'intercept') = data(k,'y')-data(k,'slope')*data(k,'x');
display data;

parameter bound(*,*) 'implied bounds on x and y';
bound(
'x','lo') = smin(k,data(k,'x'));
bound(
'x','up') = smax(k,data(k,'x'));
bound(
'y','lo') = smin(k,data(k,'y'));
bound(
'y','up') = smax(k,data(k,'y'));
display bound;

parameter M(*,*,s) 'tight big-M values';
M(
'y','>=',s) = -smin(k, data(k,'y') - data(s,'slope')*data(k,'x') - data(s,'intercept'));
M(
'y','<=',s) = smax(k, data(k,'y') - data(s,'slope')*data(k,'x') - data(s,'intercept'));
M(
'x','>=',s(k)) = max(0,data(k,'x')-bound('x','lo'));
M(
'x','<=',s(k)) = max(0,bound('x','up')-data(k+1,'x'));
display M;

parameter num(s) 'ordinal number of each s';
num(s) =
ord(s);

variable x,y;
binary variable delta(s) 'selection of segment';

equations
   ylo(s)
   yup(s)
   xlo(s)
   xup(k)
   select
;

ylo(s)..    y =g= data(s,
'slope')*x + data(s,'intercept') - M('y','>=',s)*(1-delta(s));
yup(s)..    y =l= data(s,
'slope')*x + data(s,'intercept') + M('y','<=',s)*(1-delta(s));
xlo(s)..    x =g= data(s,
'x') - M('x','>=',s)*(1-delta(s));
xup(s(k)).. x =l= data(k+1,
'x') + M('x','<=',s)*(1-delta(s));
select ..  
sum(s, delta(s)) =e= 1;

x.fx = 5;

model bigmbin /all/;
option optcr=0;
solve bigmbin using mip maximizing y;


The output is:



----     18 PARAMETER data  

             x           y          dx          dy       slope   intercept

k1       1.000       6.000       2.000      -4.000      -2.000       8.000
k2       3.000       2.000       3.000       6.000       2.000      -4.000
k3       6.000       8.000       4.000      -1.000      -0.250       9.500
k4      10.000       7.000


----     25 PARAMETER bound  implied bounds on x and y

           lo          up

x       1.000      10.000
y       2.000       8.000


----     32 PARAMETER M  tight big-M values

              k1          k2          k3

x.>=                   2.000       5.000
x.<=       7.000       4.000
y.>=                   9.000       6.750
y.<=      19.000       8.000


The generated model looks like:


---- ylo  =G=  

ylo(k1)..  2*x + y =G= 8 ; (LHS = 10)
     
ylo(k2)..  - 2*x + y - 9*delta(k2) =G= -13 ; (LHS = -10)
     
ylo(k3)..  0.25*x + y - 6.75*delta(k3) =G= 2.75 ; (LHS = 1.25, INFES = 1.5 ****)
     

---- yup  =L=  

yup(k1)..  2*x + y + 19*delta(k1) =L= 27 ; (LHS = 10)
     
yup(k2)..  - 2*x + y + 8*delta(k2) =L= 4 ; (LHS = -10)
     
yup(k3)..  0.25*x + y =L= 9.5 ; (LHS = 1.25)
     

---- xlo  =G=  

xlo(k1)..  x =G= 1 ; (LHS = 5)
     
xlo(k2)..  x - 2*delta(k2) =G= 1 ; (LHS = 5)
     
xlo(k3)..  x - 5*delta(k3) =G= 1 ; (LHS = 5)
     

---- xup  =L=  

xup(k1)..  x + 7*delta(k1) =L= 10 ; (LHS = 5)
     
xup(k2)..  x + 4*delta(k2) =L= 10 ; (LHS = 5)
     
xup(k3)..  x =L= 10 ; (LHS = 5)
     

---- select  =E=  

select..  delta(k1) + delta(k2) + delta(k3) =E= 1 ; (LHS = 0, INFES = 1 ****)