Tuesday, May 28, 2019

Bad question?

In [1] a user asks:

I need to build a MILP (Mixed integer linear programming) constraint form this if-else statement: with beta is a constant.
if (a > b) then c = beta else c = 0
How can I build the statement to MILP constraint. Are there any techniques for solving this problem. Thank you.

In my opinion this is a difficult question to answer. To put it bluntly: this is a poor question because we lack lots of information.

  1. We are not sure which identifiers are decision variables and which ones are parameters. It is stated that beta is a constant, so probably we can deduce that all other identifiers refer to variables. In general, it is important to state these things explicitly.
  2. For the variables: we don't know the types (binary, integer, positive, free).
  3. We don't see the rest of the model. In its most general form, this if-construct is a somewhat difficult thing to handle. In practice however, we can almost always exploit knowledge from the model to simplify things considerably. I almost never encounter models where we have to handle this construct in its most general form. 
  4. Expanding on the previous point: one thing to look at is how the objective behaves. It may push variables in a certain direction, which we may exploit. Often this leads to substantial simplifications. In many cases we can drop an implication (e.g. drop the else part).
  5. We don't know what solver or modeling system is being used. Some tools have good support for implications (a.k.a. indicator constraints), which can make things much easier. In other cases, we may need to use big-M constraints.
  6. If  \(a\), \(b\) and \(c\) are (free) continuous variables, the condition \(\gt\) is a bit ambiguous. Do we really mean \(\gt\) or can we use \(\ge\)? In my models, I tend to use \(\ge\) and make \(a=b\) ambiguous: we can pick the best decision for this case (I don't want to skip a more profitable solution because of some epsilon thing). 
  7. Using assumptions in the previous point, we can write \[\begin{align} &\delta = 1 \Rightarrow a \ge b\\ & \delta=0 \Rightarrow a \le b\\ & c = \delta \cdot \beta\\ & \delta \in \{0,1\}\end{align}\] Note that the \(c\) constraint is linear: \(\beta\) is a parameter.
  8. This can be translated into  \[\begin{align} & a \ge b - M(1-\delta)\\ & a \le b + M\delta \\ & c = \delta \cdot \beta\\ & \delta \in \{0,1\}\end{align}\] To determine good values for big-\(M\) constants, again, we need to know more about the model.
  9. If you insist on \(a\gt b\), we can introduce some tolerance \(\varepsilon>0\) and write: \[\begin{align} &\delta = 1 \Rightarrow a \ge b + \varepsilon \\ & \delta=0 \Rightarrow a \le b\\ & c = \delta \cdot \beta\\ & \delta \in \{0,1\}\end{align}\] Here \(\varepsilon\) should be larger than the feasibility tolerance of the solver (scaling may make this not completely obvious). Note that we effectively create a "forbidden region". The variable \(a\) can not assume any value in the interval \((b,b+\varepsilon)\) (again, subject to feasibility tolerances).
  10. Of course when we have integer variables \(a \gt b\) is much more well-defined and we can interpret that as \(a \ge b+1\).

So the best answer is: I would need to look at the whole model to give a good answer.

These type of questions are quite common, and answering them is just very difficult. You can not expect a reply that enumerates all possible answers for all possible cases.

References


  1. Build MILP constraint from if-else statements, https://stackoverflow.com/questions/55899166/build-milp-constraint-from-if-else-statements

Bad naming?


Example C++ code

The above fragment is from [1]. I never write loops like this. I use \(n\) for limits or counts, but never for a loop index.

Looking at this, I realized I have many of these "rules". Such as:

  1. \(x\), \(y\), \(z\), and \(v\), \(w\) are always double precision variables. (I used to subtract points if a student would write
        for (int x=0; ... ).
  2. \(i\), \(j\), \(k\) and \(m\), \(n\) are always integer variables.
  3. Never use \(l\) (i.e. \(\ell\)) as variable name, it is too close to the digit 1 (one).
  4. Don't use short integers (unless for a specific reason) or single precision variables.
  5. Use \(i\),\(j\),\(k\) as loop indices in a predictable way (e.g. for a (sparse) matrix: \(i\) for rows, \(j\) for columns, \(k\) for things like nonzero elements).
  6. The previous rule also applies to AMPL which uses local index names. E.g. after declaring
       param f_max {j in FOOD} >= f_min[j]; 
    I always use j for FOOD
  7. Use short names for items (variables) that are often used and for locally declared indices. Use long names for items that are sparsely used. I call this Huffman-code [2] naming. 
I am so used to this, that code that disobeys this in a flagrant way, just hurts my eyes. I find that, if I follow these simple rules, reading code is easier. It minimizes the surprise factor. Of course, writing code is for consumption by a compiler (or other tool), but more importantly: for consumption by a human reader.

So, that loop should look like:

const int n = 10;
for (int i = 0; i < n; ++i) {...}

References

  1. https://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution
  2. https://en.wikipedia.org/wiki/Huffman_coding

Tuesday, May 14, 2019

Ordering of variables / constraints in an LP/MIP

In [1] a user asks about how to force a certain ordering of variables for an LP/MIP model. This is an intriguing question for me because I really never worry about this.

  • The user asks only about variable ordering. Of course there is a symmetry here: the same question can be asked for equations. Well, equations correspond to slack variables, so this is essentially the same.
  • A different ordering may (will?) cause a different solution path, so you can expect different solution times and iteration counts. Of course, this is in a very unpredictable way, so not something we can exploit easily. Especially in MIPs we have this concept of performance variability [2]: minor changes in the model can cause significant different solution times.
  • Why would you really worry about the ordering? There are three cases I can think of: (1) access variables by index number [I tend to use higher-level tools where this is not needed], (2) when implementing some decomposition scheme [same: in higher-level tools we can index by names], or (3) when plotting the non-zero elements in the A matrix [I never do this; I find this is not adding much insight into the model].  In practice, I never worry about the ordering. We have already many things to worry about when building LP/MIP models; this should not be one of them. 

I don't understand why one would want to spend time on this.

References

  1. https://stackoverflow.com/questions/56122889/how-to-control-ordering-of-matlab-optimproblem-variables
  2. Andrea Lodi, Andrea Tramontani. Performance Variability in Mixed-Integer Programming. In INFORMS TutORials in Operations Research. Published online: 14 Oct 2014; 1-12, https://pubsonline.informs.org/doi/abs/10.1287/educ.2013.0112

Cutting Application

Version 1 of cutting application.



The algorithms are a small part of the whole application.  Some of the issues:


  1. Users are not always able to "specify" in detail what they want in advance. Building prototypes can help: giving feedback about a demo is easier than writing detailed specs in advance.
  2. Most code is not dedicated to the algorithm itself, but rather to surrounding things. I estimate that the algorithms cover about 20% of the code. For instance, reporting was a substantial effort here.

In the application, I draw on a canvas, but we can export to Excel:

Excel version of output

or print to a PDF file:

PDF version of output


Anaconda Python Install

The installation of anaconda 64 bit on my windows laptop worked fine. But then all conda and pip commands came back with the fatal error:

Can't connect to HTTPS URL because the SSL module is not available.

I found lots of messages about this. This bug seems to bug users for a long time. Even though some of these reports are closed as "Resolved" by the developers, I encountered this problem just today. Here was my solution (from [1]):



I.e. two DLLs were in the wrong directory.


Immediately after this I noticed a second problem:



A third problem is that conda install sometimes takes forever.

All these problems are not unique to me. Using google, I see lots of other users having the same or similar problems.

I had hoped that this would be a bit less painful. This was a standard install on a standard windows machine. This really should not give all these problems.

References


  1. https://github.com/conda/conda/issues/8273