## Tuesday, March 19, 2019

### Visualization of a large scheduling solution

Some optimization models have very "small" solutions. May be just: yes this is feasible. Many scheduling problems, however, have large solutions with thousands of entries. One of the problems we face when developing larger scheduling problems is: how do we look at this large solution? The solution data is too large to eye ball, and looking at tables with 0-1 decision variables is very difficult.

In this model, the main variable is x which is a binary variable indicating if a person is assigned to a facility at time t. The GAMS listing file is not a very convenient way to look at this:

 ---- VAR x  assign provider to facility                                                  LOWER          LEVEL          UPPER         MARGINAL 5   .7  .Hospitalist          .04/01/2019          .              .             1.0000         3.0000      5   .7  .Hospitalist          .04/02/2019          .              .             1.0000         3.0000      5   .7  .Hospitalist          .04/03/2019          .              .             1.0000         3.0000      5   .7  .Hospitalist          .04/04/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/05/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/06/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/07/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/08/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/09/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/10/2019          .              .             1.0000         9.0000      5   .7  .Hospitalist          .04/11/2019          .              .             1.0000         3.0000      5   .7  .Hospitalist          .04/12/2019          .              .             1.0000         3.0000

I just showed a few records: there are more than 40,000 of them. Obviously, this is fairly useless for our application.

The GDX data viewer is showing by default:

 Default view
This is not much better. We can improve this a bit by pivoting: only look at the levels and display the date dimension as columns:

 Pivot table
These are still raw results. We see we have numeric id's instead of more meaningful names. The reason to use the id's is that they are guaranteed unique (they are keys).

To be able get a more compact and meaningful visualization of the results we need to post-process the results. E.g. we prefer names instead of id's. Here I show a visualization in Excel, where we can create different views (using the auto-filter facility in Excel). Note: I have anonymized the names in this demo (in the real application we have real names to add immediate context).

 Excel based visualization

Not only is this effort useful for viewing and understanding the results (important for debugging and improving the optimization model), but it also allows me to communicate results with the client. Although I ship results in the form of CSV files (to be consumed by a database tool), this CSV file is difficult to interpret directly. This Excel spreadsheet allows us to look at our large, complex solution in much more efficient and effective way. Of course, I have automated things, so after an optimization run, I can generate this spreadsheet using a script. Well, rather two scripts: one is written in GAMS and generates a large sparse data cube (with about 50k entries; later versions have about 100k entries), the second one is written in VBA and does the coloring among other things.

The advantage of using Excel instead of using a dedicated tool, is that it makes it easier to share results. In the business world, practically everyone is familiar with Excel. Emailing spreadsheets around is much more accepted than sending say executables.

Although creating this tool is not exactly cheap, being able to inspect results in a meaningful way and communicate about them is a big pay off. It is not unusual these tools develop into a more important decision support tool than just a debugging aid for an optimization model.

As an aside: developing this optimization model turned out to be an excellent tool to improve the underlying database. Optimization models look at all data in context, instead of just looking at it record by record. As we are solving large systems of simultaneous equations, we look at data as a whole: everything has to be up to snuff. This will put lots of emphasis on correct data, much more than traditional database applications. Data that initially looks quite decent, may actually contain more errors and inconsistencies than you would think. An optimization model, being very picky about the data, is a good tool to unearth these. So if you want to check your data: build some optimization models!

Developing a mathematical programming application is often much more than just doing mathematical programming.

## Thursday, March 14, 2019

### Piecewise linear functions and formulations for interpolation (part 3)

In previous posts I discussed several Mixed Integer Programming formulations for piecewise linear functions [1,2]. In this final third part, I want to demonstrate some newer formulations, using a logarithmic number of binary variables [6]. Finally we observe some problems with existing implementations in Pyomo.

In part 1 [1] you can find discussion and formulations for:

1. SOS2: use SOS2 variables to model the piecewise linear functions. This is an easy modeling exercise.
2. BIGM_BIN: use binary variables and big-M constraints to enable and disable constraints. This is a more complex undertaking, especially if we want to use the smallest possible values for the big-M constants. Pyomo has bug in this version.
3. BIGM_SOS1: a slight variation of the BIGM_BIN model. Pyomo has the same problem with this model.
Part 2 [2] contains the following formulations:

1. DCC: Disaggregated Convex Combination formulation is a simulation of the SOS2 model by binary variables.
2. CC:  Convex Combination formulation is a similar SOS2 like approach using binary variables only.
3. MC: Multiple Choice model uses a semi-continuous approach.
4. INCR: Incremental formulation is using all previous segments.

In this part I discuss:

1. LOG: Formulation with a logarithmic number of binary variables based on the CC model.
2. DLOG: DCC model with a logarithmic number of binary variables.

First I want to side step things, and talk about using binary variables to model integer values.

#### A side note: integer variables as a series of binary variables.

In MIP formulations for the Sudoku puzzle we expand an integer variable indicating the value of a cell $$(i,j)$$: $v_{i,j} \in \{1,\dots,9\}$ into a series of binary variables: $x_{i,j,k} = \begin{cases} 1 & \text{if cell (i,j) has value k, for k=1,\dots,9}\\ 0 & \text{otherwise}\end{cases}$ This is done to make it easier to formulate "all-different" constraints.  The value can be calculated as: $v_{i,j} = \sum_k k \cdot x_{i,j,k}$ In this expansion we use a linear number of binary variables.

A more compact transformation is to use a power expansion. Here we translate an integer variable $$v \in \{0,\dots,U\}$$ into a logarithmic number of binary variables: $v = \sum_{i=1}^{1+\lfloor \log_2(U) \rfloor} 2^{i-1}x_i$ I.e., for a variable with an upperbound of $$U=100$$, we need 7 binary variables. This idea seems to go back to [3]. I have seen this power expansion method being used to solve MIP problems with the ZOOM [4] solver, which only allowed continuous and binary variables.

We will use a different way to encode segment numbers. For this we need to know about Gray codes.

#### Another side note: binary encodings

In the previous section we used a standard binary encoding. A different encoding is using the Gray code [7]. This looks like:

Integer ValueBinary CodeGray Code
000000000
100010001
200100011
300110010
401000110
501010111
601100101
701110100
810001100

The standard binary code has the advantage there is a simple (linear) formula to go from binary to decimal which we can use in a MIP model. The Gray code has no simple formula (an algorithm must be used). The Gray code has as special property that each next code is just one bit different from the previous one. E.g. if we go from 7 to 8, the standard binary code differs in all 4 binary digits $$0111\rightarrow 1000$$. The Gray code only differs in one binary digit: $$0100\rightarrow 1100$$.

With this we have enough to attack our piecewise linear interpolation problem.

#### Recap: the Convex Combination Model

The Convex Combination Model simulates SOS2 variables using binary variables:

CC - Convex Combination Formulation
\begin{align} & \color{DarkRed}x = \sum_k \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{k} \\ & \color{DarkRed}y = \sum_k \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{k} \\ & \sum_k \color{DarkRed}\lambda_k = 1 \\& \color{DarkRed} \lambda_k \le \begin{cases} \color{DarkRed}\delta_k & \text{for k=1}\\ \color{DarkRed}\delta_{k-1} + \color{DarkRed}\delta_k & \text{for k=2,\dots,K-1} \\ \color{DarkRed}\delta_{k-1} & \text{for k=K}\end{cases} \\& \sum_s \color{DarkRed}\delta_s=1 \\ & \color{DarkRed}\lambda_k \ge 0 \\ & \color{DarkRed}\delta_s \in \{0,1\}\\ &k \in \{1,\dots,\color{DarkBlue}K\},\> s \in \{1,\dots,\color{DarkBlue}K-1\} \end{align}

This model will make sure only two consecutive $$\lambda_k$$'s can be nonzero (this is called a SOS2 constraint).

In this model we have $$K-1$$ binary variables ($$K-1$$ is the number of segments).

#### Numerical Example: what is the idea?

We can encode each segment using our Gray code $$000, 001, 011, 010, 110, \dots$$ and associate a binary variable with each digit.

The variables have the following role:

1. The $$\delta$$ variables are binary variables used to indicate which segment is selected.
2. The $$\lambda$$ variables are continuous variables used to perform interpolation along the current segment.

In in the picture above, we can see that if the first digit is 0, we can exclude the last segment, and breakpoint $$\lambda_4$$. If we do this a bit systematically, we can formulate the implications: \begin{align} & \delta_1=0 \Rightarrow \lambda_4=0\\& \delta_1=1 \Rightarrow \lambda_1=\lambda_2=0\\ & \delta_2=0 \Rightarrow \lambda_3=\lambda_4=0\\& \delta_2=1 \Rightarrow \lambda_1=0\end{align} or \begin{align} \lambda_4 &\le \delta_1 \\ \lambda_1+\lambda_2 &\le 1-\delta_1\\ \lambda_3+\lambda_4&\le \delta_2\\ \lambda_1 &\le 1-\delta_2\end{align}

Let's enumerate the results of these constraints. In the table below, we show all possible combinations for $$\delta_b, b=1,2$$. For each combination we mark with a 0 which $$\lambda_k$$'s will be fixed to zero.

Gray code implications
$$\delta_1$$$$\delta_ 2$$$$\lambda_1$$$$\lambda_2$$$$\lambda_3$$$$\lambda_4$$
0000
0100
1100
100000

We see that each combination of $$\delta$$'s selects the correct segment. E.g. $$\delta=(0,1)$$ selects the segment defined by $$\lambda_2$$ and $$\lambda_3$$. In addition $$\delta=(1,0)$$ is not feasible (the $$\lambda$$'s have to add up to one).

#### Why a Gray code?

The previous table explains the usefulness of the Gray code. If we would have used the standard binary encoding, we would not see the same table. Using the encoding: segment1 = 00, segment2 = 01, segment3 = 10, we see:

Standard binary code implications
$$\delta_1$$$$\delta_ 2$$$$\lambda_1$$$$\lambda_2$$$$\lambda_3$$$$\lambda_4$$
00 0
010 0
1000
11000

This is not useful for our purposes. We want to allow two adjacent $$\lambda$$'s to be nonzero.

#### Developing a model

To make a model out of this, we define a boolean incidence matrix $I_{b,v,k} = \begin{cases} \text{true} & \text{if breakpoint k is part of a segment s that has binary digit b equal to v}\\ \text{false} & \text{otherwise}\end{cases}$ where $$v \in \{0,1\}$$. For our example we have:

----     36 SET incident

point1      point2      point3      point4

b1.0         YES         YES         YES
b1.1                                 YES         YES
b2.0         YES         YES
b2.1                     YES         YES         YES


With this the model can look like:

LOG - Logarithmic model
\begin{align} & \color{DarkRed}x = \sum_k \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{k} \\ & \color{DarkRed}y = \sum_k \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{k} \\ & \sum_k \color{DarkRed}\lambda_k = 1 \\& \sum_{k | \text{not } I(b,0,k)} \color{DarkRed} \lambda_k \le \color{DarkRed} \delta_b && \forall b \\& \sum_{k | \text{not } I(b,1,k)} \color{DarkRed}\lambda_k\le 1-\color{DarkRed}\delta_b && \forall b\\ & \color{DarkRed}\lambda_k \ge 0 \\ & \color{DarkRed}\delta_b \in \{0,1\}\\ & k \in \{1,\dots,\color{DarkBlue}K\}\\ & s \in \{1,\dots,\color{DarkBlue}K-1\}\\ & b \in \{1,\dots,\lceil log_2(\color{DarkBlue}K-1) \rceil \} \end{align}

#### The GAMS model

The GAMS model can look like:

 set   k 'breakpoints' /point1*point4/   s 'segments' /segment1*segment3/   sk(s,k) 'mapping'   b 'number of binary variables' /b1,b2/   b01 /0,1/ ; * calculate mapping between breakpoints and segments sk(s,k) = ord(s)=ord(k) or ord(s)=ord(k)-1; display sk; table data(k,*) 'location of breakpoints'            x   y   point1   1   6   point2   3   2   point3   6   8   point4  10   7 ; table gray(s,b) 'Gray encoding of segments'              b1  b2    segment1   0   0    segment2   0   1    segment3   1   1 ; set   incident(b,b01,k)   notincident(b,b01,k) ; incident(b,'1',k) = sum(sk(s,k),gray(s,b)); incident(b,'0',k) = sum(sk(s,k),1-gray(s,b)); notincident(b,b01,k) = not incident(b,b01,k); display incident,notincident; positive variable lambda(k) 'interpolation'; binary variable delta(b) 'segment encoding'; variable x,y; equations    xdef       'x'    ydef       'y'    link0(b)   'link delta-lambda'    link1(b)   'link delta-lambda'    sumlambda  'interpolation' ; xdef.. x =e= sum(k, lambda(k)*data(k,'x')); ydef.. y =e= sum(k, lambda(k)*data(k,'y')); link0(b)..  sum(notincident(b,'0',k),lambda(k)) =l= delta(b); link1(b)..  sum(notincident(b,'1',k),lambda(k)) =l= 1-delta(b); sumlambda.. sum(k, lambda(k)) =e= 1; x.fx = 5; model m /all/; option optcr=0; solve m maximizing y using mip; display x.l,y.l,delta.l,lambda.l;

The results look like:

----     61 VARIABLE x.L                   =        5.000
VARIABLE y.L                   =        6.000

----     61 VARIABLE delta.L  segment encoding

b2 1.000

----     61 VARIABLE lambda.L  interpolation

point2 0.333,    point3 0.667


This indicates we selected segment 2 with breakpoints 2 and 3. We also have $$\delta = (0,1)$$. The generated equations look like:

---- xdef  =E=  x

xdef..  - lambda(point1) - 3*lambda(point2) - 6*lambda(point3) - 10*lambda(point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)

---- ydef  =E=  y

ydef..  - 6*lambda(point1) - 2*lambda(point2) - 8*lambda(point3) - 7*lambda(point4) + y =E= 0 ; (LHS = 0)

link0(b1)..  lambda(point4) - delta(b1) =L= 0 ; (LHS = 0)

link0(b2)..  lambda(point3) + lambda(point4) - delta(b2) =L= 0 ; (LHS = 0)

link1(b1)..  lambda(point1) + lambda(point2) + delta(b1) =L= 1 ; (LHS = 0)

link1(b2)..  lambda(point1) + delta(b2) =L= 1 ; (LHS = 0)

---- sumlambda  =E=  interpolation

sumlambda..  lambda(point1) + lambda(point2) + lambda(point3) + lambda(point4) =E= 1 ; (LHS = 0, INFES = 1 ****)


Of course, a logarithmic model does not make much sense for a model with just 3 segments. We just save 1 binary variable: instead of $$K-1=3$$ binary variables, we have $$\lceil \log_2(K-1)\rceil = 2$$ binary variables.

#### Pyomo

The Pyomo model for this problem can look like:

#
# expected solution X=5, Y=6
#

xdata = [1., 3., 6., 10.]
ydata  = [6.,2.,8.,7.]

from pyomo.core import *

model = ConcreteModel()

model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))

model.con = Piecewise(model.Y,model.X,
pw_pts=xdata,
pw_constr_type='EQ',
f_rule=ydata,
pw_repn='LOG')

# see what we get for Y when X=5
def con2_rule(model):
return model.X==5

model.con2 = Constraint(rule=con2_rule)

model.obj = Objective(expr=model.Y, sense=maximize)


The output is:

 [    0.00] Setting up Pyomo environment [    0.00] Applying Pyomo preprocessing actions ERROR: Constructing component 'con' from data=None failed: ValueError: 'con'     does not have a list of domain points with length (2^n)+1 [    0.02] Pyomo Finished ERROR: Unexpected exception while loading model:         'con' does not have a list of domain points with length (2^n)+1

Bummer.  You probably have to pad the breakpoints until we have a number of segments that is equal to a power of two. I don't understand this requirement. As we saw before, there really is no good reason for this restriction. As a workaround, we can do this by just repeating the last points as many times as needed:

xdata = [1., 3., 6., 10., 10.]
ydata  = [6.,2.,8.,7.,7.]


#### The DLOG formulation.

The DCC model discussed in [2] looks like:

DCC - Disaggregated Convex Combination Formulation
\begin{align} & \color{DarkRed}x = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{s,k} \\ & \color{DarkRed}y = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{s,k}\\& \color{DarkRed}\delta_s = \sum_{k|\color{DarkBlue}SK(s,k)} \color{DarkRed} \lambda_{s,k} \\& \sum_s \color{DarkRed}\delta_s=1 \\ & \color{DarkRed}\lambda_{s,k} \ge 0 \\ & \color{DarkRed}\delta_s \in \{0,1\} \end{align}

This model can be adapted to use the logarithmic encoding trick discussed before. We use a similar incidence table as before, but now it refers to $$\lambda_{s,k}$$. I.e.:  $I_{b,v,s,k} = \begin{cases} \text{true} & \text{if breakpoint k is part of a segment s that has binary digit b equal to v}\\ \text{false} & \text{otherwise}\end{cases}$ where $$v \in \{0,1\}$$.

This yields a model like:

DLOG - DCC Formulation with logarithmic number of binary variables.
\begin{align} & \color{DarkRed}x = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{x}}_k \color{DarkRed}\lambda_{s,k} \\ & \color{DarkRed}y = \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkBlue}{\bar{y}}_k \color{DarkRed}\lambda_{s,k} \\ & \sum_{s,k|\color{DarkBlue}SK(s,k)} \color{DarkRed}\lambda_{s,k} = 1 \\& \sum_{s,k | \text{not } I(b,0,s,k)} \color{DarkRed} \lambda_{s,k} \le \color{DarkRed}\delta_b && \forall b \\& \sum_{s,k | \text{not } I(b,1,s,k)} \color{DarkRed} \lambda_{s,k} \le 1- \color{DarkRed}\delta_b && \forall b \\ & \color{DarkRed}\lambda_{s,k} \ge 0 \\ & \color{DarkRed}\delta_b \in \{0,1\}\\ & k \in \{1,\dots,\color{DarkBlue}K\}\\ & s \in \{1,\dots,\color{DarkBlue}K-1\}\\ & b \in \{1,\dots,\lceil log_2(\color{DarkBlue}K-1) \rceil \} \end{align}

The GAMS model can look like:

 set   k 'breakpoints' /point1*point4/   s 'segments' /segment1*segment3/   sk(s,k) 'mapping'   b 'number of binary variables' /b1,b2/   b01 /0,1/ ; * calculate mapping between breakpoints and segments sk(s,k) = ord(s)=ord(k) or ord(s)=ord(k)-1; display sk; table data(k,*) 'location of breakpoints'            x   y   point1   1   6   point2   3   2   point3   6   8   point4  10   7 ; table gray(s,b) 'Gray encoding of segments'              b1  b2    segment1   0   0    segment2   0   1    segment3   1   1 ; set   incident(b,b01,s,k)   notincident(b,b01,s,k) ; incident(b,'1',sk(s,k)) = gray(s,b); incident(b,'0',sk(s,k)) = 1-gray(s,b); notincident(b,b01,sk) = not incident(b,b01,sk); display incident,notincident; positive variable lambda(s,k) 'interpolation'; binary variable delta(b) 'segments are Gray encoded'; variable x,y; equations    xdef       'x'    ydef       'y'    link0(b)   'link delta-lambda'    link1(b)   'link delta-lambda'    sumlambda  'select segment' ; xdef.. x =e= sum(sk(s,k), lambda(s,k)*data(k,'x')); ydef.. y =e= sum(sk(s,k), lambda(s,k)*data(k,'y')); link0(b)..  sum(notincident(b,'0',s,k),lambda(s,k)) =l= delta(b); link1(b)..  sum(notincident(b,'1',s,k),lambda(s,k)) =l= 1-delta(b); sumlambda.. sum(sk(s,k),lambda(s,k)) =e= 1; x.fx = 5; model m /all/; option optcr=0; solve m maximizing y using mip; display x.l,y.l,delta.l,lambda.l;

The solution looks like:

----     61 VARIABLE x.L                   =        5.000
VARIABLE y.L                   =        6.000

----     61 VARIABLE delta.L  segments are Gray encoded

b2 1.000

----     61 VARIABLE lambda.L  interpolation

point2      point3

segment2       0.333       0.667


This means $$\delta = (0,1)$$ which gives us segment 2.

We can also show the individual equations:

---- xdef  =E=  x

xdef..  - lambda(segment1,point1) - 3*lambda(segment1,point2) - 3*lambda(segment2,point2) - 6*lambda(segment2,point3)

- 6*lambda(segment3,point3) - 10*lambda(segment3,point4) + x =E= 0 ; (LHS = 5, INFES = 5 ****)

---- ydef  =E=  y

ydef..  - 6*lambda(segment1,point1) - 2*lambda(segment1,point2) - 2*lambda(segment2,point2) - 8*lambda(segment2,point3)

- 8*lambda(segment3,point3) - 7*lambda(segment3,point4) + y =E= 0 ; (LHS = 0)

link0(b1)..  lambda(segment3,point3) + lambda(segment3,point4) - delta(b1) =L= 0 ; (LHS = 0)

link0(b2)..  lambda(segment2,point2) + lambda(segment2,point3) + lambda(segment3,point3) + lambda(segment3,point4)

- delta(b2) =L= 0 ; (LHS = 0)

link1(b1)..  lambda(segment1,point1) + lambda(segment1,point2) + lambda(segment2,point2) + lambda(segment2,point3)

+ delta(b1) =L= 1 ; (LHS = 0)

link1(b2)..  lambda(segment1,point1) + lambda(segment1,point2) + delta(b2) =L= 1 ; (LHS = 0)

---- sumlambda  =E=  select segment

sumlambda..  lambda(segment1,point1) + lambda(segment1,point2) + lambda(segment2,point2) + lambda(segment2,point3)

+ lambda(segment3,point3) + lambda(segment3,point4) =E= 1 ; (LHS = 0, INFES = 1 ****)


#### Pyomo

Again when we try Pyomo, we get into trouble:

#
# expected solution X=5, Y=6
#

xdata = [1., 3., 6., 10.]
ydata  = [6.,2.,8.,7.]

from pyomo.core import *

model = ConcreteModel()

model.X = Var(bounds=(1,10))
model.Y = Var(bounds=(0,100))

model.con = Piecewise(model.Y,model.X,
pw_pts=xdata,
pw_constr_type='EQ',
f_rule=ydata,
pw_repn='DLOG')

# see what we get for Y when X=5
def con2_rule(model):
return model.X==5

model.con2 = Constraint(rule=con2_rule)

model.obj = Objective(expr=model.Y, sense=maximize)


yields the following messages:

 [    0.00] Setting up Pyomo environment [    0.00] Applying Pyomo preprocessing actions ERROR: Constructing component 'con' from data=None failed: ValueError: 'con'     does not have a list of domain points with length (2^n)+1 [    0.02] Pyomo Finished ERROR: Unexpected exception while loading model:         'con' does not have a list of domain points with length (2^n)+1

I am not sure what the reason can be for this restriction.

#### Conclusion

This is hopefully a somewhat more accessible description of methods for piecewise linear interpolation with just a logarithmic number of binary variables. This method can be applied to many formulations. Here we just showed two: the Convex Combination (CC) and the Disaggregated Convex Combination (DCC) formulation. Finally. we demonstrated some problems in the Pyomo implementation.

#### References

1. Piecewise linear functions and formulations for interpolation (part 1), http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and.html
2. Piecewise linear functions and formulations for interpolation (part 2), http://yetanothermathprogrammingconsultant.blogspot.com/2019/02/piecewise-linear-functions-and_22.html
3. L. J. Watters, Reduction of integer polynomial programming problems to zero-one linear programming problems, Operations Research 15, 1171–1174,  1967
4. Jaya Singhal, Roy E. Marsten, Thomas L. Morin, Fixed Order Branch-and-Bound Methods for Mixed-Integer Programming: The zoom System, ORSA Journal on Computing, 1989, vol. 1, issue 1, 44-51.
5. Juan Pablo Vielma, Shabbir Ahmed and George Nemhauser, Mixed-Integer Models for Nonseparable Piecewise Linear Optimization: Unifying Framework and Extensions, Operations Research 58(2):303-315, 2010
6. Juan Pablo Vielma and George L. Nemhauser, Modelling Disjunctive Constraints with a Logarithmic Number of Binary Variables and Constraints, Mathematical Programming 128 (2011), pp. 49-72.
7. Gray Code, https://en.wikipedia.org/wiki/Gray_code