Yet Another Math Programming Consultant: Piecewise linear functions and formulations for interpolation (part 1)

Sunday, February 17, 2019

Piecewise linear functions and formulations for interpolation (part 1)

Pyomo has a tool to use different formulations for piecewise linear functions. From the help:

>>> from pyomo.core import *

>>> help(Piecewise)

Help on class Piecewise in module pyomo.core.base.piecewise:

class Piecewise(pyomo.core.base.block.Block)

| Piecewise(*args, **kwds)

| Adds piecewise constraints to a Pyomo model for functions of the

| form, y = f(x).

| Usage:

| model.const = Piecewise(index_1,...,index_n,yvar,xvar,**Keywords)

| model.const = Piecewise(yvar,xvar,**Keywords)

| Keywords:

| -pw_pts={},[],()

| A dictionary of lists (keys are index set) or a single list

| (for the non-indexed case or when an identical set of

| breakpoints is used across all indices) defining the set of

| domain breakpoints for the piecewise linear

| function. **ALWAYS REQUIRED**

| -pw_repn=''

| Indicates the type of piecewise representation to use. This

| can have a major impact on solver performance.

| Choices: (Default 'SOS2')

| ~ + 'SOS2' - Standard representation using sos2 constraints

| ~ 'BIGM_BIN' - BigM constraints with binary variables.

| Theoretically tightest M values are automatically

| determined.

| ~ 'BIGM_SOS1' - BigM constraints with sos1 variables.

| Theoretically tightest M values are automatically

| determined.

| ~*+ 'DCC' - Disaggregated convex combination model

| ~*+ 'DLOG' - Logarithmic disaggregated convex combination model

| ~*+ 'CC' - Convex combination model

| ~*+ 'LOG' - Logarithmic branching convex combination

| ~* 'MC' - Multiple choice model

| ~*+ 'INC' - Incremental (delta) method

| + Supports step functions

| * Source: "Mixed-Integer Models for Non-separable Piecewise Linear

| Optimization: Unifying framework and Extensions" (Vielma,

| Nemhauser 2008)

| ~ Refer to the optional 'force_pw' keyword.

| -pw_constr_type=''

| Indicates the bound type of the piecewise function.

| Choices:

-- More --

Some of these representations are not immediately obvious. Let's try to get a intuitive understanding of them. For a more formal discussion see [3].

I just consider the case I am using most often: a one-dimensional piecewise function $y = f(x)$. Note that in the context of an optimization model we need not necessarily to interpret this as: given an $x$, calculate an $y$, but rather as part of a system of equations. I.e. $x$ and $y$ are determined at same time (if $y$ is known, we find $x$, or even: we find $x,y$ simultaneously).

The piecewise linear function is defined by its breakpoints:

Some observations:

We have $K$ breakpoints $(\bar{x}_k,\bar{y}_k)$ and $K-1$ segments,
we assume $\bar{x}_k$ is ordered: $\bar{x}_{k+1}\ge \bar{x}_k$,
we assume we have lower and upper bounds on $x$ and $y$,
we assume the curve is connected (i.e. no forbidden regions for $x$, these have to be handled separately,
we allow step functions by adding a vertical segment (not all formulation allow this).

Finally, we note that interpolating between two points $(\bar{x}_1,\bar{y}_1)$, $(\bar{x}_2,\bar{y}_2)$ can be stated as choosing two weights $\lambda_1, \lambda_2$ with: \[\begin{align} & x = \lambda_1 \bar{x}_1 + \lambda_2 \bar{x}_2\\ & y = \lambda_1 \bar{y}_1 + \lambda_2 \bar{y}_2 \\ & \lambda_1+\lambda_2 = 1 \\ & \lambda_1, \lambda_2 \ge 0 \end{align}\]

Method 1: SOS2 formulation

This one is the easiest to remember. Many more advanced MIP solvers have facilities for SOS2 variables [4]. The definition is:

Let $x_1,\dots,x_n$ be members of a Special Ordered Set of Type 2 (SOS2), then only two variables $x_i$ can be nonzero and they have to be neighbors. All other variables are zero.

This looks like a strange animal, but in fact this construct is especially designed for cases like our piecewise linear interpolation scheme. Using SOS2 constraints, we can easily formulate a model:

SOS2 Formulation
\[\begin{align} & \color{DarkRed}x = \sum_k \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_k \\ & \color{DarkRed}y = \sum_k \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_k\\& \sum_k \color{DarkRed} \lambda_k = 1 \\ & \color{DarkRed}\lambda_k \ge 0 \\ & \mathit{SOS2}(\color{DarkRed}\lambda_1,\dots,\color{DarkRed}\lambda_K) \end{align}\]

Note that $\color{DarkRed}x$ and $\color{DarkRed}y$ are decision variables, while $\color{DarkBlue}{\bar{x}}_k$ and $\color{DarkBlue}{\bar{y}}_k$ are constants. The color coding helps to distinguish variables from data.

In the solution, two adjacent $\lambda_s,\lambda_{s+1}$ will be between 0 and 1, and for these we have $\lambda_s+\lambda_{s+1}=1$. All the others will be zero. I.e. we will be interpolating between the two corresponding breakpoints $(\bar{x}_s, \bar{y}_s)$ and $(\bar{x}_{s+1}, \bar{y}_{s+1})$. This SOS2 formulation does two things in one swoop: it selects the segment $s$ the variables $x$ and $y$ belong to, and it interpolates between the two selected breakpoints.

Notes:

This method is used quite a lot in practice, not in the least because it makes life easy for the modeler. It is useful to have this formulation in your toolbox.
This approach handles step functions without a problem.
There are no issues with big-M constants: there are none.
However, binary variables may offer performance benefits.

The Pyomo model is as follows:

#
# expected solution X=5, Y=6
#

xdata = [1., 3., 6., 10.]
ydata  = [6.,2.,8.,7.]

from pyomo.core import *

model = ConcreteModel()

model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))

model.con = Piecewise(model.Y,model.X,
                      pw_pts=xdata,
                      pw_constr_type='EQ',
                      f_rule=ydata,
                      pw_repn='SOS2')

# see what we get for Y when X=5
def con2_rule(model):
    return model.X==5

model.con2 = Constraint(rule=con2_rule)

model.obj = Objective(expr=model.Y, sense=maximize)

It is also instructive to see how this can be modeled in a more traditional modeling language like GAMS:

set
k 'breakpoints' /point1*point4/
;
table data(k,*)
           x   y
point1   1   6
point2   3   2
point3   6   8
point4 10   7
;

sos2 variable lambda(k) 'sos2 variable';
variable x,y;
equations
   refrow    'reference row'
   funrow    'function row'
   convexity
;

refrow.. x =e= sum(k, lambda(k)*data(k,'x'));
funrow.. y =e= sum(k, lambda(k)*data(k,'y'));
convexity.. sum(k, lambda(k)) =e= 1;

x.fx = 5;

model m /all/;
option optcr=0;
solve m maximizing y using mip;
display x.l,y.l,lambda.l;

Results will look like:

----     29 VARIABLE x.L                   =        5.000  
            VARIABLE y.L                   =        6.000  

----     29 VARIABLE lambda.L  sos2 variable

point2 0.333,    point3 0.667

This is the correct solution when $x=5$.

Method 2: Big-M Binary Formulation

In this formulation we turn on or off each linear equality depending if we select a segment. Looking at our simple example, we can state the problem as: \[y = \begin{cases} -2x+8 & x \in [1,3]\\ 2x-4 & x \in [3,6]\\ -0.25x+9.5 & x \in [6,10]\end{cases}\] We can model this "case statement" with binary variables as follows: define \[\delta_s = \begin{cases} 1 & \text{if segment $s$ is selected}\\ 0 & \text{otherwise}\end{cases}\] When $\delta_s=0$ for a given segment $s$, we need to make the corresponding equation non-binding, and also the corresponding limits on $x$. This can be accomplished using inequalities and some big-M's. As an example, consider the second segment. We can write: \[\begin{align} & 2x-4-(1-\delta_2)M\le y \le 2x-4 +(1-\delta_2)M \\ & 3 - (1-\delta_2)M \le x \le 6+ (1-\delta_2)M\end{align}\] Finding good values for each $M$ is a bit tedious. Basically we have to choose the smallest $M$ such that each breakpoint is reachable (i.e. not cut off). For the constraint \[ y \le 2x-4 +(1-\delta_2)M\] we can write: \[ \begin{align} M & = \max_k \{\bar{y}_k - 2\bar{x}_k+4\}\\ & = \max \{ 6-2\cdot 1+4,2-2\cdot 3 + 4, 8-2\cdot 6 +4 , 7-2 \cdot 10+4 \} \\ & = \max \{8,0,0,-9\}\\&=8\end{align} \] The zero values correspond to breakpoints on the current segment.

A high level description can look like: \[\begin{align} &\delta_s = 1 \Rightarrow \begin{cases} y = a_s x + b_s \\ \bar{x}_s \le x \le \bar{x}_{s+1} \end{cases}\\ &\sum_s \delta_s = 1\\ &\delta_s \in \{0,1\}\end{align} \] where $a_s, b_s$ are the slope and intercept for the linear function in segment $s$. These coefficients can be calculated easily from the breakpoints $(\bar{x}_s,\bar{y}_s)$ and $(\bar{x}_{s+1},\bar{y}_{s+1})$. This implication can be implemented as follows:

Big-M Binary Formulation
\[\begin{align} & \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s - (1-\color{DarkRed}\delta_s) \color{DarkBlue} M \le \color{DarkRed}y \le \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s + (1-\color{DarkRed}\delta_s) \color{DarkBlue} M \\ & \color{DarkBlue}{\bar{x}}_s - (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \le \color{DarkRed}x \le \color{DarkBlue}{\bar{x}}_{s+1} + (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \\ & \sum_s \color{DarkRed} \delta_s=1\\ & \color{DarkRed}\delta_s \in \{0,1\}\\ & \color{DarkBlue}a_s = \frac{\color{DarkBlue}{\bar{y}}_{s+1}-\color{DarkBlue}{\bar{y}}_s}{\color{DarkBlue}{\bar{x}}_{s+1}-\color{DarkBlue}{\bar{x}}_s} \\ &\color{DarkBlue}b_s = \color{DarkBlue}{\bar{y}}_s - \color{DarkBlue} a_s \color{DarkBlue}{\bar{x}}_s \end{align}\]

Big-M Binary Formulation

\[\begin{align} & \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s - (1-\color{DarkRed}\delta_s) \color{DarkBlue} M \le \color{DarkRed}y \le \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s + (1-\color{DarkRed}\delta_s) \color{DarkBlue} M \\ & \color{DarkBlue}{\bar{x}}_s - (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \le \color{DarkRed}x \le \color{DarkBlue}{\bar{x}}_{s+1} + (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \\ & \sum_s \color{DarkRed} \delta_s=1\\ & \color{DarkRed}\delta_s \in \{0,1\}\\ & \color{DarkBlue}a_s = \frac{\color{DarkBlue}{\bar{y}}_{s+1}-\color{DarkBlue}{\bar{y}}_s}{\color{DarkBlue}{\bar{x}}_{s+1}-\color{DarkBlue}{\bar{x}}_s} \\ &\color{DarkBlue}b_s = \color{DarkBlue}{\bar{y}}_s - \color{DarkBlue} a_s \color{DarkBlue}{\bar{x}}_s \end{align}\]

This looks a bit more intimidating, but it is conceptually simple. Unfortunately we see some problems with the Pyomo implementation of this formulation.

Notes:

We need 4 inequalities per segment, and one binary variable.
This method assumes we can calculate a slope $a_s$. For step functions we have an infinite slope when there is a vertical segment. I.e. we cannot do step functions using this method.
It is imperative to choose reasonable values for the big-M's.
In the model above I just used one $M$, but actually each of them can have a different value.
This is not a formulation I typically use: it is somewhat cumbersome to setup.
Pyomo implements this method incorrectly, and we will get wrong results.

Method 3: Big-M SOS1 formulation

This is just a minor variant on the previous method. Instead of using binary variables, we use continuous variables that are members of a Special Ordered Set of Type 1. SOS1 sets are defined by:

Let $x_1,\dots,x_n$ be members of a Special Ordered Set of Type 1 (SOS1), then only one variable $x_i$ can be nonzero. All other variables are zero.

Basically we replace the binary variables: \[\begin{align} &\sum_s \delta_s = 1\\ & \delta_s \in \{0,1\}\end{align}\] by \[\begin{align} &\sum_s \lambda_s = 1 &\\ & \lambda_s \ge 0 \\ & \mathit{SOS1}(\lambda_1,\dots,\lambda_{K-1})\end{align}\] This does not change the model a lot:

Big-M SOS1 Formulation
\[\begin{align} & \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s - (1-\color{DarkRed}\lambda_s) \color{DarkBlue} M \le \color{DarkRed}y \le \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s + (1-\color{DarkRed}\lambda_s) \color{DarkBlue} M \\ & \color{DarkBlue}{\bar{x}}_s - (1-\color{DarkRed}\lambda_s)\color{DarkBlue}M \le \color{DarkRed}x \le \color{DarkBlue}{\bar{x}}_{s+1} + (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \\ & \sum_s \color{DarkRed} \lambda_s=1\\ & \color{DarkRed}\lambda_s\ge 0\\& \mathit{SOS1}(\color{DarkRed}\lambda_1,\dots\,\color{DarkRed}\lambda_{K-1})\\ & \color{DarkBlue}a_s = \frac{\color{DarkBlue}{\bar{y}}_{s+1}-\color{DarkBlue}{\bar{y}}_s}{\color{DarkBlue}{\bar{x}}_{s+1}-\color{DarkBlue}{\bar{x}}_s} \\ &\color{DarkBlue}b_s = \color{DarkBlue}{\bar{y}}_s - \color{DarkBlue} a_s \color{DarkBlue}{\bar{x}}_s \end{align}\]

Big-M SOS1 Formulation

\[\begin{align} & \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s - (1-\color{DarkRed}\lambda_s) \color{DarkBlue} M \le \color{DarkRed}y \le \color{DarkBlue}a_s \color{DarkRed}x + \color{DarkBlue}b_s + (1-\color{DarkRed}\lambda_s) \color{DarkBlue} M \\ & \color{DarkBlue}{\bar{x}}_s - (1-\color{DarkRed}\lambda_s)\color{DarkBlue}M \le \color{DarkRed}x \le \color{DarkBlue}{\bar{x}}_{s+1} + (1-\color{DarkRed}\delta_s)\color{DarkBlue}M \\ & \sum_s \color{DarkRed} \lambda_s=1\\ & \color{DarkRed}\lambda_s\ge 0\\& \mathit{SOS1}(\color{DarkRed}\lambda_1,\dots\,\color{DarkRed}\lambda_{K-1})\\ & \color{DarkBlue}a_s = \frac{\color{DarkBlue}{\bar{y}}_{s+1}-\color{DarkBlue}{\bar{y}}_s}{\color{DarkBlue}{\bar{x}}_{s+1}-\color{DarkBlue}{\bar{x}}_s} \\ &\color{DarkBlue}b_s = \color{DarkBlue}{\bar{y}}_s - \color{DarkBlue} a_s \color{DarkBlue}{\bar{x}}_s \end{align}\]

Notes:

This method shows the same Pyomo bug as the big-M binary variable model: it is incorrectly implemented. See the analysis below for what is wrong.
If a solver support SOS1 variables, it also will typically support SOS2 variables. The SOS2 approach may be easier to use.
This method is supposedly marked for removal in future versions of Pyomo.
We can use the SOS1 variables to prevent the need for big-M's. We can write something like \[\begin{align} & a_s x+b_s -\lambda_s \le y \le a_s x+b_s +\lambda_s\\ & \bar{x}_s - \lambda_s \le x \le \bar{x}_{s+1} + \lambda_s \\ & \sum_s \delta_s = 1\\& \lambda_s \ge 0\\& \delta_s \in \{0,1\} \\ & \mathit{SOS1}(\lambda_s,\delta_s) \end{align} \] If you look at this carefully, you can see this implements the implications: \[\begin{align} & \delta_s = 1 \Rightarrow \begin{cases} y=a_s x + b_s \\ \bar{x}_s \le x \le \bar{x}_{s+1}\end{cases}\\ &\sum_s \delta_s = 1\\ & \delta_s \in \{0,1\} \end{align} \] This is similar to how a MIP solver may rewrite implications. This particular way of dealing with an implication is a useful concept to know.
I never have used the BIGM_SOS1 formulation in my models. It just does not look very appetizing.

A nasty Pyomo bug

To test the bigM-bin method on our simple data set I formulated the model: \[\begin{align}\max \> & y \\ & \text{piecewise on our data} \\ & x=5 \\ & x\in [1,10]\end{align}\] We expect to see as solution: $x=5, y=6$ and an objective of 6. Instead we see:

D:\Python\Python37\Scripts>pyomo solve --solver=glpk bigm.py

[ 0.00] Setting up Pyomo environment

[ 0.00] Applying Pyomo preprocessing actions

WARNING: DEPRECATED: The 'BIGM_BIN' and 'BIGM_SOS1' piecewise representations

will be removed in a future version of Pyomo. They produce incorrect

results in certain cases

[ 0.60] Creating model

[ 0.60] Applying solver

[ 0.83] Processing results

Number of solutions: 1

Solution Information

Gap: 0.0

Status: optimal

Function Value: 8.25

Solver results file: results.json

[ 0.86] Applying Pyomo postprocessing actions

[ 0.86] Pyomo Finished

D:\Python\Python37\Scripts>

Well, this is disappointing. We get a wrong solution!

We also get a warning that the Pyomo people are aware of problems with this method.

This is more wrong than what can be explained away with some numerical issues related to large big-M values. Large big-M values can lead to wrong results. I don't think that is the case here, but let's investigate.

The details of the Pyomo error

This section may become a bit dreary: we will dive into the details of this bug. Not for the faint of heart.

The model looks like the Pyomo model shown in the SOS2 section. The only difference is that we used a different method: BIGM_BIN instead of SOS2.

The solution file is:

{
    "Problem": [
        {
            "Lower bound": 8.25,
            "Name": "unknown",
            "Number of constraints": 9,
            "Number of nonzeros": 21,
            "Number of objectives": 1,
            "Number of variables": 6,
            "Sense": "maximize",
            "Upper bound": 8.25
        }
    ],
    "Solution": [
        {
            "number of solutions": 1,
            "number of solutions displayed": 1
        },
        {
            "Constraint": "No values",
            "Gap": 0.0,
            "Message": null,
            "Objective": {
                "obj": {
                    "Value": 8.25
                }
            },
            "Problem": {},
            "Status": "optimal",
            "Variable": {
                "X": {
                    "Value": 5.0
                },
                "Y": {
                    "Value": 8.25
                },
                "con.bin_y[3]": {
                    "Value": 1.0
                }
            }
        }
    ],
    "Solver": [
        {
            "Error rc": 0,
            "Statistics": {
                "Branch and bound": {
                    "Number of bounded subproblems": "1",
                    "Number of created subproblems": "1"
                }
            },
            "Status": "ok",
            "Termination condition": "optimal",
            "Time": 0.12992453575134277
        }
    ]
}

We see two problems:

Clearly the Y value is wrong: it should be 6 instead of 8.25.
We see X=5 and con.bin_y[3]=1. The last value indicates we are in the third segment. But the third segment is $[6,10]$ and does not include $x=5$.

To debug this, we can generate and inspect the LP file. This is a bit hard to read:

\* Source Pyomo model name=unknown *\

max 
obj:
+1 Y

s.t.

c_e_con2_:
+1 X
= 5

c_u_con_BIGM_constraint1(1)_:
+2 X
+1 Y
+19 con_bin_y(1)
<= 27

c_u_con_BIGM_constraint1(2)_:
-2 X
+1 Y
+8 con_bin_y(2)
<= 4

c_u_con_BIGM_constraint1(3)_:
+0.25 X
+1 Y
<= 9.5

c_e_con_BIGM_constraint2_:
+1 con_bin_y(1)
+1 con_bin_y(2)
+1 con_bin_y(3)
= 1

c_l_con_BIGM_constraint3(1)_:
+2 X
+1 Y
>= 8

c_u_con_BIGM_constraint3(2)_:
+2 X
-1 Y
+9 con_bin_y(2)
<= 13

c_u_con_BIGM_constraint3(3)_:
-0.25 X
-1 Y
+6.75 con_bin_y(3)
<= -2.75

c_e_ONE_VAR_CONSTANT: 
ONE_VAR_CONSTANT = 1.0

bounds
   1 <= X <= 10
    -inf <= Y <= +inf
   0 <= con_bin_y(1) <= 1
   0 <= con_bin_y(2) <= 1
   0 <= con_bin_y(3) <= 1
binary
  con_bin_y(1)
  con_bin_y(2)
  con_bin_y(3)
end

Let's compare carefully what we expect and what we see:

Constraint	Expected	Generated by Pyomo	Notes
Objective	\[\max\>y \]	max obj: +1 Y
Segment 1	\[ y \le -2x+8 +(1-\delta_1)M \]	+2X+1Y+19con_bin_y(1)<=27	$M=19$
	\[ y \ge -2x+8 -(1-\delta_1)M \]	+2X+1Y>=8	$M=0$
	\[ x \ge 1 -(1-\delta_1)M \]		This one is not needed.
	\[ x \le 3 + (1-\delta_1)M \]		$M=7$
Segment 2	\[ y \le 2x-4 +(1-\delta_2)M \]	-2X+1Y+8con_bin_y(2)<=4	$M=8$
	\[ y \ge 2x-4 -(1-\delta_2)M \]	+2X-1Y+9con_bin_y(2)<=13	$M=9$
	\[ x \ge 3 -(1-\delta_2)M \]		$M=2$
	\[ x \le 6 + (1-\delta_2)M \]		$M=4$
Segment 3	\[ y \le -0.25x+9.5 +(1-\delta_3)M \]	+0.25X+1Y<=9.5	$M=0$
	\[ y \ge -0.25x+9.5 -(1-\delta_3)M \]	0.25X-1Y+6.75con_bin_y(3)<=-2.75	$M=6.75$
	\[ x \ge 6 -(1-\delta_3)M \]		$M=5$
	\[ x \le 10 + (1-\delta_3)M \]		This one is not needed.
Sum	\[\sum_s \delta_s = 1\]	+1con_bin_y(1) +1con_bin_y(2) +1con_bin_y(3)=1
Fix	\[x=5\]	+1X=5

We see we have missing inequalities relating $x$ to the segments. The cells marked in yellow indicate the missing constraints. No wonder we got the wrong results. The omission to include the condition \[\delta_s = 1 \Rightarrow \bar{x}_s \le x \le \bar{x}_{s+1}\] explains our erroneous solution: we are looking at the wrong segment.

We cannot blame the big-M's for this problem. The Pyomo piecewise function is just forgetting to include a number of inequalities.

This was an interesting exercise: debugging without looking at the code.

References

Piecewise linear functions and formulations for interpolation (part 2). This post discusses formulations DCC, CC, MC and INCR. http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and_22.html
Piecewise linear functions and formulations for interpolation (part 3). This post describes the LOG and DLOG formulations, which have a logarithmic number of binary variables. We see also some issues with the Pyomo implementation. http://yetanothermathprogrammingconsultant.blogspot.com/2019/03/piecewise-linear-functions-and.html
Juan Pablo Vielma, Shabbir Ahmed and George Nemhauser, Mixed-Integer Models for Nonseparable Piecewise Linear Optimization: Unifying Framework and Extensions, Operations Research 58(2):303-315, 2010
E.M.L. Beale, J.J.H. Forrest, Global Optimization Using Special Ordered Sets, Math Prog 10, 52-69, 1976

Appendix: script to calculate big-M's in BIGM_INT and BIGM_SOS1 models

It is easy to make mistakes in the calculate of the big M values for the BIGM_INT and BIGM_SOS1 models. I used the following model to calculate them:

set
k 'breakpoints' /k1*k4/
s(k) 'segments' /k1*k3/
;

table data(k,*)
       x   y
k1   1   6
k2   3   2
k3   6   8
k4 10   7
;

data(s(k),'dx') = data(k+1,'x')-data(k,'x');
data(s(k),'dy') = data(k+1,'y')-data(k,'y');
data(s(k),'slope') = data(k,'dy')/data(k,'dx');
data(s(k),'intercept') = data(k,'y')-data(k,'slope')*data(k,'x');
display data;

parameter bound(*,*) 'implied bounds on x and y';
bound('x','lo') = smin(k,data(k,'x'));
bound('x','up') = smax(k,data(k,'x'));
bound('y','lo') = smin(k,data(k,'y'));
bound('y','up') = smax(k,data(k,'y'));
display bound;

parameter M(*,*,s) 'tight big-M values';
M('y','>=',s) = -smin(k, data(k,'y') - data(s,'slope')*data(k,'x') - data(s,'intercept'));
M('y','<=',s) = smax(k, data(k,'y') - data(s,'slope')*data(k,'x') - data(s,'intercept'));
M('x','>=',s(k)) = max(0,data(k,'x')-bound('x','lo'));
M('x','<=',s(k)) = max(0,bound('x','up')-data(k+1,'x'));
display M;

parameter num(s) 'ordinal number of each s';
num(s) = ord(s);

variable x,y;
binary variable delta(s) 'selection of segment';

equations
   ylo(s)
   yup(s)
   xlo(s)
   xup(k)
   select
;

ylo(s)..    y =g= data(s,'slope')*x + data(s,'intercept') - M('y','>=',s)*(1-delta(s));
yup(s)..    y =l= data(s,'slope')*x + data(s,'intercept') + M('y','<=',s)*(1-delta(s));
xlo(s)..    x =g= data(s,'x') - M('x','>=',s)*(1-delta(s));
xup(s(k)).. x =l= data(k+1,'x') + M('x','<=',s)*(1-delta(s));
select ..   sum(s, delta(s)) =e= 1;

x.fx = 5;

model bigmbin /all/;
option optcr=0;
solve bigmbin using mip maximizing y;

The output is:

----     18 PARAMETER data  

             x           y          dx          dy       slope   intercept

k1       1.000       6.000       2.000      -4.000      -2.000       8.000
k2       3.000       2.000       3.000       6.000       2.000      -4.000
k3       6.000       8.000       4.000      -1.000      -0.250       9.500
k4      10.000       7.000


----     25 PARAMETER bound  implied bounds on x and y

           lo          up

x       1.000      10.000
y       2.000       8.000


----     32 PARAMETER M  tight big-M values

              k1          k2          k3

x.>=                   2.000       5.000
x.<=       7.000       4.000
y.>=                   9.000       6.750
y.<=      19.000       8.000

The generated model looks like:

---- ylo  =G=  

ylo(k1)..  2*x + y =G= 8 ; (LHS = 10)
     
ylo(k2)..  - 2*x + y - 9*delta(k2) =G= -13 ; (LHS = -10)
     
ylo(k3)..  0.25*x + y - 6.75*delta(k3) =G= 2.75 ; (LHS = 1.25, INFES = 1.5 ****)
     

---- yup  =L=  

yup(k1)..  2*x + y + 19*delta(k1) =L= 27 ; (LHS = 10)
     
yup(k2)..  - 2*x + y + 8*delta(k2) =L= 4 ; (LHS = -10)
     
yup(k3)..  0.25*x + y =L= 9.5 ; (LHS = 1.25)
     

---- xlo  =G=  

xlo(k1)..  x =G= 1 ; (LHS = 5)
     
xlo(k2)..  x - 2*delta(k2) =G= 1 ; (LHS = 5)
     
xlo(k3)..  x - 5*delta(k3) =G= 1 ; (LHS = 5)
     

---- xup  =L=  

xup(k1)..  x + 7*delta(k1) =L= 10 ; (LHS = 5)
     
xup(k2)..  x + 4*delta(k2) =L= 10 ; (LHS = 5)
     
xup(k3)..  x =L= 10 ; (LHS = 5)
     

---- select  =E=  

select..  delta(k1) + delta(k2) + delta(k3) =E= 1 ; (LHS = 0, INFES = 1 ****)

14 comments:

RobertFebruary 13, 2020 at 10:26 AM
I have a MILP model with some SOS2 constraints. I implement the SOS2 constraints with two approaches including SOS2 variable definition in GAMS, using native SOS2 provided by CPLEX, and a MILP technique enforcing SOS2 constraints using integer variables. While the both models follow the same branching procedure in the beginning, the native SOS2 presents a premature convergence albeit with zero integrality gap resulting a different solution compared to the MILP technique outcome. Actually, the MILP technique converges to a better solution. Interestingly, for a smaller case study the results are exactly similar maybe suggesting that both the models are equivalent. As the MILP model is convex in the relaxed form (relaxing integer values) it is expected that all MILP solution techniques converge to same solution. I could not found any parameters to set in SOS2 framework to further explore the problem space. What does the difference originate from?
ReplyDelete
Replies
RobertFebruary 14, 2020 at 12:41 AM
Many thanks for your attention. Unfortunately, SOS2 implemented by CPLEX in GAMS does not provide any parameter to tune such as acceptable relative gap or time limit. Furthermore, the SOS2 solution shows zero integrality gap in final results. On the other hand, the MIP model, here convex combination model of a pricewise linear function, also displays zero gap but a more optimal solution is achieved. The codes and the log files can be seen below. Please mechanically search 17980.804661 as the final solution of SOS2 technique, which I called "premature convergence". The solution of the convex combination MIP model also reaches to 17980.804661, here shows 0.24% gap, but continues until finding more optimal solution 17959.45. It is noted that the codes require two Microsoft excel files to run which I could not attach.
ReplyDelete
Replies
RobertFebruary 14, 2020 at 12:44 AM
This comment has been removed by the author.
ReplyDelete
Replies
RobertFebruary 14, 2020 at 12:47 AM
This comment has been removed by the author.
ReplyDelete
Replies
RobertFebruary 14, 2020 at 12:47 AM
This comment has been removed by the author.
ReplyDelete
Replies
RobertFebruary 15, 2020 at 1:10 AM
This comment has been removed by the author.
ReplyDelete
Replies
TWMay 7, 2020 at 11:44 AM
Thks for sharing, very helpful!! Did you notify Pyomo's authors of the root course to fix the problem instead of removing `BIGM_BIN` completely? There might be legitimate use cases for it. After all you did already most of the work '-).
ReplyDelete
Replies
UnknownMay 7, 2021 at 4:53 PM
Thanks Erwin for the comprehensive summary here and in parts 2 and 3 of all of the different pwl formulations. Regarding the posts regarding "premature" convergence and other similar issues of inconsistent solver results, the following tactics can help determine the source of the different results.

1) Check the solution quality of the different solutions. On some models, larger violations of the integrality tolerance that are still deemed feasible can yield significant changes in the objective.

2) Check the node logs for the first point at which you see an inconsistency; it can happen far before the final declarations of optimality.

3) Plug the "better" solution into the model that gave the "worse" solution. If it is not feasible, get an explanation of infeasibility (e.g. with CPLEX's conflict refiner or Gurobi's infeasibility finder). That can shed light on unintended differences in the model. If the models are sufficiently different that you cannot just fix solution values from one into the other, use a constraint on the objective function value instead.
ReplyDelete
Replies
AnonymousJune 25, 2021 at 3:31 AM
Thank you very much for this great explanations and examples! It helped me a lot to understand the difference between the formulations!
ReplyDelete
Replies

Add comment