Tuesday, July 30, 2019

Advanced R (second edition)


The second edition is out. My version has color pictures (so it must be good!). This book is about the R programming language (not about statistical techniques). Even if you know R really well, you will learn from this book. Hadley Wickham is the author of many extremely useful R packages including ggplot.

Contents


  1. Introduction

    I Foundation
  2. Names and values
  3. Vectors
  4. Subsetting
  5. Control flow
  6. Functions
  7. Environments
  8. Conditions

    II Functional Programming
  9. Functionals
  10. Function factories
  11. Function operators

    III Object-oriented Programming
  12. Base types
  13. S3
  14. R6
  15. S4
  16. Trade-offs

    IV Metaprogramming
  17. Big picture
  18. Expressions
  19. Quasiquotation
  20. Evaluation
  21. Translating R code

    V Techniques
  22. Debugging
  23. Measuring performance
  24. Improving performance
  25. Rewriting R code in C++

References


  1. https://adv-r.hadley.nz/  Online version of book
  2. https://github.com/hadley/adv-r The source code for the book. The book is written in bookdown.
  3. https://bookdown.org/

Tuesday, July 16, 2019

Quadratic Indicator Constraint?

I was pondering about a constraint of the form:\[x_{i,k}=1 \Rightarrow d_{i,k} \ge \sum_j \left( \bar{x}_{k,j} - p_{i,j} \right)^2\] where \(x_{i,k}\in \{0,1\}\). Obviously, this could be implemented easily using indicator constraints, which are supported by all major solvers. These constraints have the form: \[\begin{align} &x_i = 0 \Rightarrow \text{constraint} \\ &\text{or} \\ & x_i = 1 \Rightarrow \text{constraint}\end{align}\] where \(x_i \in \{0,1\}\).

Unfortunately, indicator constraints are only supported for linear constraints. The Cplex docs say:

  • The constraint must be linear; a quadratic constraint is not allowed to have an indicator constraint.
  • A lazy constraint cannot have an indicator constraint.
  • A user-defined cut cannot have an indicator constraint.

Gurobi mentions:

An indicator constraint \(y=f  \rightarrow a^Tx \le b\) states that if the binary variable \(y\) has the value \(f\in \{0,1\}\) in a given solution then the linear constraint \(a^Tb \le b\) has to be satisfied.

Xpress and SCIP say similar things: indicator constraints are only for linear constraints.

Somehow, I expected that I could use a (convex) quadratic constraint with an indicator. As none of the solvers has this implemented, I suspect there is a good reason why this is not possible.

Big-M and SOS1


So, now we cannot use indicators, we need to use more traditional implementation techniques. The most obvious is to use a big-\(M\) formulation: \[  d_{i,k} \ge \sum_j \left( \bar{x}_{k,j} - p_{i,j} \right)^2 - M(1-x_{i,k})\] The disadvantage of this formulation is that we need to establish a good value for \(M\) and that this value cannot be too large. Large values of \(M\) tend to confuse MIP solvers.

If you have no good (and small) values for \(M\), we can use SOS1 sets. I.e. we write \[\begin{align} &d_{i,k} \ge \sum_j \left( \bar{x}_{k,j} - p_{i,j} \right)^2 - s_{i,k} \\ & s_{i,k}\ge 0\\ & \{s_{i,k},x_{i,k}\} \in SOS1\end{align}\] This says: either \(s_{i,k}\) or \(x_{i,k}\) can be non-zero, but not both. To be precise: both can be zero. Most MIP solvers support SOS1 sets.

Finally, as stated in the comments, an alternative formulation would be \[\begin{align} & d'_{i,k} \ge \sum_j \left( \bar{x}_{k,j} - p_{i,j} \right)^2 \\ & x_{i,k}=1 \Rightarrow d_{i,k} \ge  d'_{i,k} \end{align}\] This comes at the expense of extra variables and constraints. Basically we added one indirection. With this formulation, we have a linear indicator constraint and a separate convex quadratic inequality. I assume the presolver will not presolve this away.

Error Messages


In my opinion well-formulated error messages are very important. As a user, we are already confused that the system is not doing what we want, so a confusing error message is just piling on. Let's see what we see when using a quadratic indicator constraint:

  • GAMS has poor support for indicators in general, but is doing really poorly here. It does not give an error message, but just gives a wrong solution. This is really bad (I reported this, so it will be fixed).
  • AMPL gives an error message that seems to come from outer space.
    CPLEX 12.9.0.0: logical constraint _slogcon[1] is not an indicator constraint.Let's blame the slogcons (???). Obviously, I have no constraint (or any other symbol) called slogcon.
  • Cplex's interactive optimizer is doing a better job:
    CPLEX Error  1605: Line 5: Illegal quadratic term in a constraint.
  • Cplex's OPL has a difficult time understanding the (convex) quadratic indicator constraint. It says:
    *** FATAL[ENGINE_002]: Exception from IBM ILOG CPLEX: CPLEX Error  5002: 'q1' is not convex. Obviously the constraint is convex, so there is a problem in how OPL generates the constraint (i.e. this is a bug). The error message just does not make sense. The message is also bad: I don't have a `q1` in the model. 

A better message would be something like:
Equation `e2`: quadratic terms in an indicator constraint are not supported. Please reformulate this constraint.

Basically you want an error message to describe the problem in a way that is understandable (even when the user is slightly panicked). And secondly, it is a good idea to tell the user what to do now. In this case you need to reformulate (this tells the user there is no need to search for some Cplex option that can help here). And, never, never mention the slogcons.

Of course, it would be even better if solvers would support quadratic indicator constraints. In design there is this concept of orthogonality: it is best to have as few exceptions as possible.

Sunday, July 7, 2019

Binary multiplication

We want to express \[\begin{align} & z = x \cdot y \\ & x,y,z \in \{0,1\}\end{align}\] in a linear fashion. There are two good reasons to do this. This linearization will allow us to use linear Mixed Integer Programming solvers instead of quadratic or MINLP solvers. In addition, the linearization can bypass any non-convexity in the original quadratic formulation. The linear formulation will buy us more accessible solvers, and possibly better performance.

As we are talking about binary variables, \(z = x \cdot y\) can be interpreted as \(z = x \textbf{ and } y\). We can resort to two standard formulations:

Formulation 1Formulation 2
\[\begin{align} & \color{darkred} z \le \color{darkred}x\\ & \color{darkred}z \le \color{darkred}y\\ & \color{darkred}z \ge \color{darkred}x+\color{darkred}y-1\\ & \color{darkred}x,\color{darkred}y \in \{0,1\} \\ & \color{darkred}z \in [0,1] \end{align}\] \[\begin{align} & \color{darkred} z \le \frac{\color{darkred}x+\color{darkred}y}{2}\\ & \color{darkred} z \ge \color{darkred}x+\color{darkred}y-1\\ & \color{darkred}x,\color{darkred}y,\color{darkred}z \in \{0,1\} \\ \end{align}\]

I always use the first form. But I quite often see the second formulation being used (e.g. [1]).

The second form can be interpreted as an aggregation of the first one by adding up \(z\le x\) and \(z \le y\).

There is a good reason to use the first form: it is tighter. For instance, the fractional value \((x,y) = (0, 1/2)\) yields \(z=0\) in the first formulation, but would allow \(z=0.25\) in the second formulation. The first formulation is so tight, we can relax \(z\in\{0,1\}\) to \(z\in[0,1]\).

My attempt to show graphically the difference:

Formulation 1
Formulation 2


Obvious formulation 1 seems to be the winner. But when I try this out on a model [2] I see:

Results for Cplex, default settings, 1 thread
SizeModel 1
Seconds/nodes
Model 2
Seconds/nodes
\(n=20,k=10\)2.9 / 31171.8 / 3018
\(n=25,k=5\)12 / 624511 / 7043
\(n=30,k=4\)15 / 526313 / 4471

Well, this does not seem to prove my point. My preferred formulation is actually slightly underperforming.

Let's look at a different solver to see if that proves my point.


Results for CBC, 1 thread
SizeModel 1
Seconds/nodes
Model 2
Seconds/nodes
\(n=20,k=10\)46 / 30821623 / 279364
\(n=25,k=5\)105 / 68601823 / 87396
\(n=30,k=4\)144 / 12008213 / 11336

These results look quite different. First, obviously CBC is slower than Cplex. No surprise there. But also we see formulation 1 is much better than formulation 2. This is more like I expected beforehand. I suspect the cuts produced by Cplex eliminated the advantage of formulation 1. We see that more often: some modeling tricks are becoming less important as solvers are getting smarter.

Notes:

  1. The meaning of \(n\) and \(k\) for the size of the problem is explained in [2].
  2. The variables \(z\) were declared as binary both in models 1 and 2.


References