In previous posts I discussed several Mixed Integer Programming formulations for piecewise linear functions [1,2]. In this final third part, I want to demonstrate some newer formulations, using a
logarithmic number of binary variables [6]. Finally we observe some problems with existing implementations in Pyomo.
In part 1 [1] you can find discussion and formulations for:
- SOS2: use SOS2 variables to model the piecewise linear functions. This is an easy modeling exercise.
- BIGM_BIN: use binary variables and big-M constraints to enable and disable constraints. This is a more complex undertaking, especially if we want to use the smallest possible values for the big-M constants. Pyomo has bug in this version.
- BIGM_SOS1: a slight variation of the BIGM_BIN model. Pyomo has the same problem with this model.
Part 2 [2] contains the following formulations:
- DCC: Disaggregated Convex Combination formulation is a simulation of the SOS2 model by binary variables.
- CC: Convex Combination formulation is a similar SOS2 like approach using binary variables only.
- MC: Multiple Choice model uses a semi-continuous approach.
- INCR: Incremental formulation is using all previous segments.
In this part I discuss:
- LOG: Formulation with a logarithmic number of binary variables based on the CC model.
- DLOG: DCC model with a logarithmic number of binary variables.
First I want to side step things, and talk about using binary variables to model integer values.
A side note: integer variables as a series of binary variables.
In MIP formulations for the Sudoku puzzle we expand an integer variable indicating the value of a cell \((i,j)\): \[v_{i,j} \in \{1,\dots,9\}\] into a series of binary variables: \[x_{i,j,k} = \begin{cases} 1 & \text{if cell $(i,j)$ has value $k$, for $k=1,\dots,9$}\\ 0 & \text{otherwise}\end{cases}\] This is done to make it easier to formulate "all-different" constraints. The value can be calculated as: \[v_{i,j} = \sum_k k \cdot x_{i,j,k}\] In this expansion we use a linear number of binary variables.
A more compact transformation is to use a power expansion. Here we translate an integer variable \(v \in \{0,\dots,U\}\) into a logarithmic number of binary variables: \[v = \sum_{i=1}^{1+\lfloor \log_2(U) \rfloor} 2^{i-1}x_i\] I.e., for a variable with an upperbound of \(U=100\), we need 7 binary variables. This idea seems to go back to [3]. I have seen this power expansion method being used to solve MIP problems with the ZOOM [4] solver, which only allowed continuous and binary variables.
|
We will use a different way to encode segment numbers. For this we need to know about Gray codes.
Another side note: binary encodings
In the previous section we used a standard binary encoding. A different encoding is using the Gray code [7]. This looks like:
Integer Value | Binary Code | Gray Code |
0 | 0000 | 0000 |
1 | 0001 | 0001 |
2 | 0010 | 0011 |
3 | 0011 | 0010 |
4 | 0100 | 0110 |
5 | 0101 | 0111 |
6 | 0110 | 0101 |
7 | 0111 | 0100 |
8 | 1000 | 1100 |
The standard binary code has the advantage there is a simple (linear) formula to go from binary to decimal which we can use in a MIP model. The Gray code has no simple formula (an algorithm must be used). The Gray code has as special property that each next code is just one bit different from the previous one. E.g. if we go from 7 to 8, the standard binary code differs in all 4 binary digits \(0111\rightarrow 1000\). The Gray code only differs in one binary digit: \(0100\rightarrow 1100\). |
With this we have enough to attack our piecewise linear interpolation problem.
Recap: the Convex Combination Model
The Convex Combination Model simulates SOS2 variables using binary variables:
CC - Convex Combination Formulation |
\[\begin{align} & \color{DarkRed}x = \sum_k \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{k} \\ & \color{DarkRed}y = \sum_k \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{k} \\ & \sum_k \color{DarkRed}\lambda_k = 1 \\& \color{DarkRed} \lambda_k \le \begin{cases} \color{DarkRed}\delta_k & \text{for $k=1$}\\ \color{DarkRed}\delta_{k-1} + \color{DarkRed}\delta_k & \text{for $k=2,\dots,K-1$} \\ \color{DarkRed}\delta_{k-1} & \text{for $k=K$}\end{cases} \\& \sum_s \color{DarkRed}\delta_s=1 \\ &
\color{DarkRed}\lambda_k \ge 0 \\ & \color{DarkRed}\delta_s \in \{0,1\}\\ &k \in \{1,\dots,\color{DarkBlue}K\},\> s \in \{1,\dots,\color{DarkBlue}K-1\} \end{align}\] |
This model will make sure only two consecutive \(\lambda_k\)'s can be nonzero (this is called a SOS2 constraint).
In this model we have \(K-1\) binary variables (\(K-1\) is the number of segments).
Numerical Example: what is the idea?
We can encode each segment using our Gray code \(000, 001, 011, 010, 110, \dots\) and associate a binary variable with each digit.
The variables have the following role:
- The \(\delta\) variables are binary variables used to indicate which segment is selected.
- The \(\lambda\) variables are continuous variables used to perform interpolation along the current segment.
In in the picture above, we can see that if the first digit is 0, we can exclude the last segment, and breakpoint \(\lambda_4\). If we do this a bit systematically, we can formulate the implications: \[\begin{align} & \delta_1=0 \Rightarrow \lambda_4=0\\& \delta_1=1 \Rightarrow \lambda_1=\lambda_2=0\\ & \delta_2=0 \Rightarrow \lambda_3=\lambda_4=0\\& \delta_2=1 \Rightarrow \lambda_1=0\end{align}\] or \[\begin{align} \lambda_4 &\le \delta_1 \\ \lambda_1+\lambda_2 &\le 1-\delta_1\\ \lambda_3+\lambda_4&\le \delta_2\\ \lambda_1 &\le 1-\delta_2\end{align}\]
Let's enumerate the results of these constraints. In the table below, we show all possible combinations for \(\delta_b, b=1,2\). For each combination we mark with a 0 which \(\lambda_k\)'s will be fixed to zero.
Gray code implications
\(\delta_1\) | \(\delta_
2\) | \(\lambda_1\) | \(\lambda_2\) | \(\lambda_3\) | \(\lambda_4\) |
0 | 0 | | | 0 | 0 |
0 | 1 | 0 | | | 0 |
1 | 1 | 0 | 0 | | |
1 | 0 | 0 | 0 | 0 | 0 |
We see that each combination of \(\delta\)'s selects the correct segment. E.g. \(\delta=(0,1)\) selects the segment defined by \(\lambda_2\) and \(\lambda_3\). In addition \(\delta=(1,0)\) is not feasible (the \(\lambda\)'s have to add up to one).
Why a Gray code?
The previous table explains the usefulness of the Gray code. If we would have used the standard binary encoding, we would not see the same table. Using the encoding: segment1 = 00, segment2 = 01, segment3 = 10, we see:
Standard binary code implications
\(\delta_1\) | \(\delta_
2\) | \(\lambda_1\) | \(\lambda_2\) | \(\lambda_3\) | \(\lambda_4\) |
0 | 0 | | | | 0 |
0 | 1 | 0 | | | 0 |
1 | 0 | 0 | 0 | | |
1 | 1 | 0 | 0 | | 0 |
This is not useful for our purposes. We want to allow two adjacent \(\lambda\)'s to be nonzero.
Developing a model
To make a model out of this, we define a boolean incidence matrix \[I_{b,v,k} = \begin{cases} \text{true} & \text{if breakpoint $k$ is part of a segment $s$ that has binary digit $b$ equal to $v$}\\ \text{false} & \text{otherwise}\end{cases}\] where \(v \in \{0,1\}\). For our example we have:
---- 36 SET incident
point1 point2 point3 point4
b1.0 YES YES YES
b1.1 YES YES
b2.0 YES YES
b2.1 YES YES YES
With this the model can look like:
LOG - Logarithmic model |
\[\begin{align} & \color{DarkRed}x = \sum_k \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{k} \\ & \color{DarkRed}y = \sum_k \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{k} \\ & \sum_k \color{DarkRed}\lambda_k = 1 \\& \sum_{k | \text{not } I(b,0,k)} \color{DarkRed} \lambda_k \le \color{DarkRed} \delta_b && \forall b \\& \sum_{k | \text{not } I(b,1,k)} \color{DarkRed}\lambda_k\le 1-\color{DarkRed}\delta_b && \forall b\\ &
\color{DarkRed}\lambda_k \ge 0 \\ & \color{DarkRed}\delta_b \in \{0,1\}\\ & k \in \{1,\dots,\color{DarkBlue}K\}\\ & s \in \{1,\dots,\color{DarkBlue}K-1\}\\ & b \in \{1,\dots,\lceil log_2(\color{DarkBlue}K-1) \rceil \} \end{align}\] |
The GAMS model
The GAMS model can look like:
set
k 'breakpoints' /point1*point4/
s 'segments' /segment1*segment3/
sk(s,k) 'mapping'
b 'number of binary variables' /b1,b2/
b01 /0,1/
;
* calculate mapping between breakpoints and segments
sk(s,k) = ord(s)=ord(k) or ord(s)=ord(k)-1;
display sk;
table data(k,*) 'location of breakpoints'
x y
point1 1
6
point2 3
2
point3 6
8
point4 10
7
;
table gray(s,b) 'Gray encoding of segments'
b1 b2
segment1 0
0
segment2 0
1
segment3 1
1
;
set
incident(b,b01,k)
notincident(b,b01,k)
;
incident(b,'1',k) = sum(sk(s,k),gray(s,b));
incident(b,'0',k) = sum(sk(s,k),1-gray(s,b));
notincident(b,b01,k) = not incident(b,b01,k);
display
incident,notincident;
positive variable lambda(k) 'interpolation';
binary variable delta(b) 'segment encoding';
variable x,y;
equations
xdef 'x'
ydef 'y'
link0(b) 'link delta-lambda'
link1(b) 'link delta-lambda'
sumlambda 'interpolation'
;
xdef.. x =e= sum(k, lambda(k)*data(k,'x'));
ydef.. y =e= sum(k, lambda(k)*data(k,'y'));
link0(b).. sum(notincident(b,'0',k),lambda(k)) =l= delta(b);
link1(b).. sum(notincident(b,'1',k),lambda(k)) =l= 1-delta(b);
sumlambda.. sum(k, lambda(k)) =e= 1;
x.fx = 5;
model m /all/;
option optcr=0;
solve m maximizing
y using mip;
display
x.l,y.l,delta.l,lambda.l;
|
The results look like:
---- 61 VARIABLE x.L = 5.000
VARIABLE y.L = 6.000
---- 61 VARIABLE delta.L segment encoding
b2 1.000
---- 61 VARIABLE lambda.L interpolation
point2 0.333, point3 0.667
This indicates we selected segment 2 with breakpoints 2 and 3. We also have \(\delta = (0,1)\). The generated equations look like:
---- xdef =E= x
xdef.. - lambda(point1) - 3*lambda(point2) - 6*lambda(point3) - 10*lambda(point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)
---- ydef =E= y
ydef.. - 6*lambda(point1) - 2*lambda(point2) - 8*lambda(point3) - 7*lambda(point4) + y =E= 0 ; (LHS = 0)
---- link0 =L= link delta-lambda
link0(b1).. lambda(point4) - delta(b1) =L= 0 ; (LHS = 0)
link0(b2).. lambda(point3) + lambda(point4) - delta(b2) =L= 0 ; (LHS = 0)
---- link1 =L= link delta-lambda
link1(b1).. lambda(point1) + lambda(point2) + delta(b1) =L= 1 ; (LHS = 0)
link1(b2).. lambda(point1) + delta(b2) =L= 1 ; (LHS = 0)
---- sumlambda =E= interpolation
sumlambda.. lambda(point1) + lambda(point2) + lambda(point3) + lambda(point4) =E= 1 ; (LHS = 0, INFES = 1 ****)
Of course, a logarithmic model does not make much sense for a model with just 3 segments. We just save 1 binary variable: instead of \(K-1=3\) binary variables, we have \(\lceil \log_2(K-1)\rceil = 2\) binary variables.
Pyomo
The Pyomo model for this problem can look like:
#
# expected solution X=5, Y=6
#
xdata = [1., 3., 6., 10.]
ydata = [6.,2.,8.,7.]
from pyomo.core import *
model = ConcreteModel()
model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))
model.con = Piecewise(model.Y,model.X,
pw_pts=xdata,
pw_constr_type='EQ',
f_rule=ydata,
pw_repn='LOG')
# see what we get for Y when X=5
def con2_rule(model):
return model.X==5
model.con2 = Constraint(rule=con2_rule)
model.obj = Objective(expr=model.Y, sense=maximize)
The output is:
[ 0.00]
Setting up Pyomo environment
[ 0.00]
Applying Pyomo preprocessing actions
ERROR: Constructing component 'con' from data=None
failed: ValueError: 'con'
does not
have a list of domain points with length (2^n)+1
[ 0.02]
Pyomo Finished
ERROR: Unexpected exception while loading model:
'con'
does not have a list of domain points with length (2^n)+1
|
Bummer. You probably have to pad the breakpoints until we have a number of segments that is equal to a power of two. I don't understand this requirement. As we saw before, there really is no good reason for this restriction. As a workaround, we can do this by just repeating the last points as many times as needed:
xdata = [1., 3., 6., 10., 10.]
ydata = [6.,2.,8.,7.,7.]
The DLOG formulation.
The DCC model discussed in [2] looks like:
DCC - Disaggregated Convex Combination Formulation |
\[\begin{align} & \color{DarkRed}x = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{s,k} \\ & \color{DarkRed}y = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{s,k}\\& \color{DarkRed}\delta_s = \sum_{k|\color{DarkBlue}SK(s,k)} \color{DarkRed} \lambda_{s,k} \\& \sum_s \color{DarkRed}\delta_s=1 \\ &
\color{DarkRed}\lambda_{s,k} \ge 0 \\ & \color{DarkRed}\delta_s \in \{0,1\} \end{align}\] |
This model can be adapted to use the logarithmic encoding trick discussed before. We use a similar incidence table as before, but now it refers to \(\lambda_{s,k}\). I.e.: \[I_{b,v,s,k} = \begin{cases} \text{true} & \text{if breakpoint $k$ is part of a segment $s$ that has binary digit $b$ equal to $v$}\\ \text{false} & \text{otherwise}\end{cases}\] where \(v \in \{0,1\}\).
This yields a model like:
DLOG - DCC Formulation with logarithmic number of binary variables. |
\[\begin{align} & \color{DarkRed}x = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{s,k} \\ & \color{DarkRed}y = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{s,k} \\ & \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkRed}\lambda_{s,k} = 1 \\& \sum_{s,k | \text{not } I(b,0,s,k)} \color{DarkRed} \lambda_{s,k} \le \color{DarkRed}\delta_b && \forall b \\& \sum_{s,k | \text{not } I(b,1,s,k)} \color{DarkRed} \lambda_{s,k} \le 1- \color{DarkRed}\delta_b && \forall b \\ &
\color{DarkRed}\lambda_{s,k} \ge 0 \\ & \color{DarkRed}\delta_b \in \{0,1\}\\ & k \in \{1,\dots,\color{DarkBlue}K\}\\ & s \in \{1,\dots,\color{DarkBlue}K-1\}\\ & b \in \{1,\dots,\lceil log_2(\color{DarkBlue}K-1) \rceil \} \end{align}\] |
The GAMS model can look like:
set
k 'breakpoints' /point1*point4/
s 'segments' /segment1*segment3/
sk(s,k) 'mapping'
b 'number of binary variables' /b1,b2/
b01 /0,1/
;
* calculate mapping between breakpoints and segments
sk(s,k) = ord(s)=ord(k) or ord(s)=ord(k)-1;
display sk;
table data(k,*) 'location of breakpoints'
x y
point1 1
6
point2 3
2
point3 6
8
point4 10
7
;
table gray(s,b) 'Gray encoding of segments'
b1 b2
segment1 0
0
segment2 0
1
segment3 1
1
;
set
incident(b,b01,s,k)
notincident(b,b01,s,k)
;
incident(b,'1',sk(s,k))
= gray(s,b);
incident(b,'0',sk(s,k))
= 1-gray(s,b);
notincident(b,b01,sk) = not incident(b,b01,sk);
display
incident,notincident;
positive variable lambda(s,k) 'interpolation';
binary variable delta(b) 'segments are Gray encoded';
variable x,y;
equations
xdef 'x'
ydef 'y'
link0(b) 'link delta-lambda'
link1(b) 'link delta-lambda'
sumlambda 'select segment'
;
xdef.. x =e= sum(sk(s,k), lambda(s,k)*data(k,'x'));
ydef.. y =e= sum(sk(s,k), lambda(s,k)*data(k,'y'));
link0(b).. sum(notincident(b,'0',s,k),lambda(s,k)) =l= delta(b);
link1(b).. sum(notincident(b,'1',s,k),lambda(s,k)) =l= 1-delta(b);
sumlambda.. sum(sk(s,k),lambda(s,k)) =e= 1;
x.fx = 5;
model m /all/;
option optcr=0;
solve m maximizing
y using mip;
display
x.l,y.l,delta.l,lambda.l;
|
The solution looks like:
---- 61 VARIABLE x.L = 5.000
VARIABLE y.L = 6.000
---- 61 VARIABLE delta.L segments are Gray encoded
b2 1.000
---- 61 VARIABLE lambda.L interpolation
point2 point3
segment2 0.333 0.667
This means \(\delta = (0,1)\) which gives us segment 2.
We can also show the individual equations:
---- xdef =E= x
xdef.. - lambda(segment1,point1) - 3*lambda(segment1,point2) - 3*lambda(segment2,point2) - 6*lambda(segment2,point3)
- 6*lambda(segment3,point3) - 10*lambda(segment3,point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)
---- ydef =E= y
ydef.. - 6*lambda(segment1,point1) - 2*lambda(segment1,point2) - 2*lambda(segment2,point2) - 8*lambda(segment2,point3)
- 8*lambda(segment3,point3) - 7*lambda(segment3,point4) + y =E= 0 ; (LHS = 0)
---- link0 =L= link delta-lambda
link0(b1).. lambda(segment3,point3) + lambda(segment3,point4) - delta(b1) =L= 0 ; (LHS = 0)
link0(b2).. lambda(segment2,point2) + lambda(segment2,point3) + lambda(segment3,point3) + lambda(segment3,point4)
- delta(b2) =L= 0 ; (LHS = 0)
---- link1 =L= link delta-lambda
link1(b1).. lambda(segment1,point1) + lambda(segment1,point2) + lambda(segment2,point2) + lambda(segment2,point3)
+ delta(b1) =L= 1 ; (LHS = 0)
link1(b2).. lambda(segment1,point1) + lambda(segment1,point2) + delta(b2) =L= 1 ; (LHS = 0)
---- sumlambda =E= select segment
sumlambda.. lambda(segment1,point1) + lambda(segment1,point2) + lambda(segment2,point2) + lambda(segment2,point3)
+ lambda(segment3,point3) + lambda(segment3,point4) =E= 1 ; (LHS = 0, INFES = 1 ****)
Pyomo
Again when we try Pyomo, we get into trouble:
#
# expected solution X=5, Y=6
#
xdata = [1., 3., 6., 10.]
ydata = [6.,2.,8.,7.]
from pyomo.core import *
model = ConcreteModel()
model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))
model.con = Piecewise(model.Y,model.X,
pw_pts=xdata,
pw_constr_type='EQ',
f_rule=ydata,
pw_repn='DLOG')
# see what we get for Y when X=5
def con2_rule(model):
return model.X==5
model.con2 = Constraint(rule=con2_rule)
model.obj = Objective(expr=model.Y, sense=maximize)
yields the following messages:
[ 0.00]
Setting up Pyomo environment
[ 0.00]
Applying Pyomo preprocessing actions
ERROR: Constructing component 'con' from data=None
failed: ValueError: 'con'
does not
have a list of domain points with length (2^n)+1
[ 0.02]
Pyomo Finished
ERROR: Unexpected exception while loading model:
'con'
does not have a list of domain points with length (2^n)+1
|
I am not sure what the reason can be for this restriction.
Conclusion
This is hopefully a somewhat more accessible description of methods for piecewise linear interpolation with just a logarithmic number of binary variables. This method can be applied to many formulations. Here we just showed two: the Convex Combination (CC) and the Disaggregated Convex Combination (DCC) formulation. Finally. we demonstrated some problems in the Pyomo implementation.
References
- Piecewise linear functions and formulations for interpolation (part 1), http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and.html
- Piecewise linear functions and formulations for interpolation (part 2), http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and_22.html
- L. J. Watters, Reduction of integer polynomial programming problems to zero-one linear programming
problems, Operations Research 15, 1171–1174, 1967
- Jaya Singhal, Roy E. Marsten, Thomas L. Morin, Fixed Order Branch-and-Bound Methods for Mixed-Integer Programming: The zoom System, ORSA Journal on Computing, 1989, vol. 1, issue 1, 44-51.
- Juan Pablo Vielma, Shabbir Ahmed and George Nemhauser, Mixed-Integer Models for Nonseparable Piecewise Linear Optimization: Unifying Framework and Extensions, Operations Research 58(2):303-315, 2010
- Juan Pablo Vielma and George L. Nemhauser, Modelling Disjunctive Constraints
with a Logarithmic Number of Binary Variables and Constraints, Mathematical Programming 128 (2011), pp. 49-72.
- Gray Code, https://en.wikipedia.org/wiki/Gray_code