Thursday, December 24, 2020

Numbrix puzzle via TSP

 In [1], I demonstrated a small and rather elegant MIP model to solve Numbrix puzzles. Here is the comment by Austin Buchanan:

It would also be interesting to see how this compares to an MTZ-like model. And how well the two models scale to larger grids.

The Miller-Tucker-Zemlin [2,3] formulation for the Traveling Salesman Problem (TSP) starts with a standard assignment problem, and then adds some sub-tour elimination constraints. From [3]:

Note:  here \(\color{darkblue}p\ge \color{darkblue}n\). This 1960 paper has some interesting computational tests. Many runs ended with: "the machine was stopped after over 250 pivot steps had failed to produce the solution."

Tuesday, December 22, 2020

Numbrix puzzle via MIP

The puzzle of numbrix [1,2] is simple to state.

  • We have a 9x9 board with 81 cells. 
  • We need to fill each cell with a unique integer value between 1 and 81.
  • There are some prefilled cells.
  • We should be able to start with the cell containing value 1 and then following a path (only going one up, down, left or right) to the next values and thus eventually arriving at the cell with value 81.  


Below is an example of an input (the "given" cells) and a solution. The colors may help to see that we have a path from the cell with value 1 to the cell with value 81.

Of course, we want to see if we can create a MIP model for this.

Monday, December 21, 2020

GAMS + R Markdown scripting

In [1] I discussed a small script that exported GAMS data to R for plotting. Here I expand on that and show how we can use R Markdown [2] to generate documents that contain GAMS data and results. Here we have just a small example with just a parameter. But it is easy to see that this can also be used with larger models, where we have a document built around more complex model results. 

R Markdown can handle as input text, R code, R graphics and LaTeX math (among others). Output can be HTML, PDF (via LaTeX), and DOCX (MS Word).

Saturday, December 19, 2020

GAMS + R scripting

 I like to automate things. That means in practice: I write scripts. I prefer to spend a little more time to develop a script, basically for two reasons:

  • A script can be executed again with little or no effort.
  • A script serves as documentation. 
This helps to make things reproducible.

Here is a setup I often use for post-processing GAMS results with R. 

Tuesday, December 15, 2020

Modeling a simple implication

From [1]:

How to model:  

if \(\color{darkred}\delta=0\) then \(\color{darkred}x=0\) else \(\color{darkred}x=\color{darkred}x\)

where \(\color{darkred}\delta\in \{0,1\}\) is a binary variable and \(\color{darkred}x\ge 0\) is a non-negative continuous variable. 

The else part looks strange, but it is meant to indicate: unrestricted. So better is to just drop the else part:

if \(\color{darkred}\delta=0\) then \(\color{darkred}x=0\)

Let's list some possible formulations. 

Friday, December 11, 2020

Electronic Amoeba for solving TSPs?


Combinatorial optimization to search for the best solution across a vast number of legal candidates requires the development of a domain-specific computing architecture that can exploit the computational power of physical processes, as conventional general-purpose computers are not powerful enough. Recently, Ising machines that execute quantum annealing or related mechanisms for rapid search have attracted attention. These machines, however, are hard to map application problems into their architecture, and often converge even at an illegal candidate. Here, we demonstrate an analogue electronic computing system for solving the travelling salesman problem, which mimics efficient foraging behaviour of an amoeboid organism by the spontaneous dynamics of an electric current in its core and enables a high problem-mapping flexibility and resilience using a resistance crossbar circuit. The system has high application potential, as it can determine a high-quality legal solution in a time that grows proportionally to the problem size without suffering from the weaknesses of Ising machines.

Picture of the electronic amoeba (from [2])

"Electronic amoeba" sounds awful. Furthermore, the paper talks about "legal solutions". Now the lawyers even get involved... 


  1. 'Electronic amoeba' finds approximate solution to traveling salesman problem in linear time,
  2. Kenta Saito, Masashi Aono and Seiya Kasai, Amoeba-inspired analog electronic computing system integrating resistance crossbar for solving the travelling salesman problem,


Monday, December 7, 2020

Directed vs undirected networks in math programming models

Many practical optimization models contain some network structures. In virtually all cases, these networks are modeled as directed graphs. I.e., between nodes \(i\) and \(j\), we consider arcs \(i \rightarrow j\) and \(j \rightarrow i\). In graph theory, it is very customary to talk about undirected graphs. In this case, there is just one link between nodes \(i\) and \(j\). Why am I using directed graphs, with two uni-directional arcs for each pair of nodes? I will try to answer this in this post.

As an example, let's consider the shortest path problem. This is really the same as a min-cost problem. I took the following data (from [1]):

Monday, November 30, 2020

QuadProg: usage issues

The QuadProg QP (Quadratic Programming) solver, based on an algorithm by Goldfarb and Idnani [1,2], is quite popular. It is available for R [3] and Python [4]. However, it has some real drawbacks. Some of it is just "user interface" and documentation: how to call the solver.

The mathematical model the solver works with is [3]: \[\begin{align} \min\>& -\color{darkblue}d^T\color{darkred}b + \frac{1}{2}\color{darkred}b^T\color{darkblue}D\color{darkred}b \\ & \color{darkblue}A^T\color{darkred}b \begin{bmatrix}= \\ \ge \end{bmatrix} \color{darkblue}b_0 \end{align} \]

This immediately poses some questions:

  • Variables are typically called \(\color{darkred}x\), and \(\color{darkblue}b\) is usually a constant. Why the different notation here?
    I don't know. Maybe to make sure you pay attention.
  • How to handle bounds?
    The variable type the solver supports is called "free" (i.e. unrestricted). This is different than for many LP solvers, where the default variable type is positive. For non-negative variables, you need to explicitly add the constraints \(\color{darkred}b \ge 0\). Similar for other bounds.
  • Why the different signs for the linear part and the quadratic part in the objective?
    I don't know.
  • Why the constant \(1/2\) in the objective?
    We see this more often in QP solvers. This is because the gradient of the quadratic term \(\color{darkred}x^T\color{darkblue}Q\color{darkred}x\) is \(2\color{darkblue}Q\color{darkred}x\). The algorithm prefers to work with \(\color{darkblue}Q\color{darkred}x\) however. So in order to stay really close to the algorithm, the user is asked to multiply all quadratic coefficients \(\color{darkblue}D_{i,j}\) by 2.
  • Most QP solvers call the matrix with quadratic coefficients \(\color{darkblue}Q\). Why is it called \(\color{darkblue}D\) here?
    Don't know.
  • Does \(\color{darkblue}D\) have to be symmetric?  
    The documentation does not tell, but the answer is: only the diagonal and upper-triangular part of \(\color{darkblue}D\) are used. This means the matrix is assumed to be symmetric. E.g. when passing \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 100 & 1 \end{pmatrix}\] the solver thinks it received \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\]
  • Can \(\color{darkblue}D\) be positive semi-definite?
    In QuadProg, \(\color{darkblue}D\) has to be strictly positive definite.
    Defnition: A matrix \(Q\) is called positive semi-definite when \(x^TQx\ge 0\) for all \(x\). It is called positive definite when \(x^TQx\gt 0\) for all \(x \ne 0\). A positive definite matrix has strictly positive eigenvalues. I.e., the smallest eigenvalue of \(\color{darkblue}D\) must be \(\lambda_{min}\gt 0\). For a positive semi-definite matrix all eigenvalues are non-negative (i.e., zero is allowed).
    The fact that QuadProg requires a strict positive definite matrix, means that all variables need to be quadratic and also that this algorithm can not solve LPs. Most QP solvers (in fact all the ones I know beyond QuadProg) only require positive semi-definite quadratic coefficient matrices. I.e., for linear variables we can use coefficients \(\color{darkblue}q_{i,j}=0\) (where \(i\) or \(j\) corresponds to a linear variable). Quadprog is stricter. The whole matrix \(\color{darkblue}D\) most be strictly positive definite. You should be aware of these restrictions. And the documentation [3] should have mentioned this.
  • Can I trick QuadProg to accept linear variables?
    Yes, suppose we have 2 quadratic variables and 2 linear variables. Then form the \(\color{darkblue}D\) matrix as \[\color{darkblue}D = \begin{pmatrix} \color{darkblue}d_{1,1} &\color{darkblue}d_{1,2} & 0 & 0  \\ \color{darkblue}d_{2,1} &\color{darkblue}d_{2,2} & 0 & 0  \\ 0 & 0 &\color{darkblue} \varepsilon & 0 \\ 0 & 0 & 0 &\color{darkblue} \varepsilon \end{pmatrix}\] where \(\color{darkblue} \varepsilon\gt 0\) is a small constant. This trick would also allow us to feed the algorithm linear problems.
  • Usually, solvers express linear constraints as \(\color{darkblue}A\color{darkred}x=\color{darkblue}b\). Why do we have a transposed matrix \(\color{darkblue}A^T\) here?
    I suspect this is closer to what the dual algorithm prefers. Again, making the input closer to what the solver wants.
  • How to input \(\le\) constraints?
    Multiply these by \(-1\) and make them \(\ge\) constraints.

These are some of the questions I have seen in the past about QuadProg.  Hopefully, this at least demystifies some of them.

As an alternative to QuadProg, you may want to have a look at CVXR or CVXPY for modeling QPs. These tools can talk to different solvers and will apply the necessary transformations to make the model suited for the selected solver. The modeling is also a bit more natural than the matrix input for QuadProg. For non-trivial quadratic programming models, QuadProg may not be the best choice.  


  1. D. Goldfarb and A. Idnani (1982). Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, pages 226–239.
  2. D. Goldfarb and A. Idnani (1983). A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1–33.
  3. R package quadprog,
  4. Python package quadprog,

Monday, November 23, 2020

Machine scheduling: Maintenance and maximum runlength

In [1] a problem is posed that deals with a maximum run-length (6 days) before the machine needs to be taken off-line for maintenance (1 day).  Consider the binary decision variable:

\[\color{darkred}y_t = \begin{cases} 1 & \text{if machine is online at period $t$} \\ 0 & \text{otherwise} \end{cases} \]

In [1] a scheme is proposed as follows. Introduce an integer variable \(\color{darkred}z_t\) that has the cumulative days the machine is running. Then apply the rule: 

if \(\color{darkred}z\gt 6\) then \(\color{darkred}y:=0,\color{darkred}z:=0\)

This approach needs a bit of repair to make it work for a MIP model. A MIP model can not contain assignments, and decision variables are part of a system of linear equations. I.e., all decision variables are essentially calculated at the same time (a notion which can be foreign for computer programmers who think in sequential assignment statements). We can write:\[\begin{align} & \color{darkred}y_t = 1 \Rightarrow \color{darkred}z_t  = \color{darkred}z_{t-1} + 1 \\ & \color{darkred}y_t = 0 \Rightarrow \color{darkred}z_t  = 0 \\ & \color{darkred}z_t \in \{0,\dots,6\}\end{align}\] This is in the form of implications that can be used directly with (advanced) solvers that support indicator constraints (i.e., a binary variable on the left and a linear constraint on the right). Note that we don't do anything special when \(\color{darkred}z_t\) becomes 7. We just forbid that from happening by setting an upper bound of 6. The initial value \(\color{darkblue}z_0\) is the number of periods the machine is already online at the beginning of the planning period. 

If your solver or modeling tool does not have support for indicator constraints, we can rewrite things as: \[\begin{align}& \color{darkblue}z_t^{up} := 6  \\ & \color{darkred}z_t \le \color{darkred} y_t \cdot \color{darkblue}z_t^{up}\\  & \color{darkred}z_t \le   \color{darkred}z_{t-1} + 1 \\ & \color{darkred}z_t \ge   \color{darkred}z_{t-1} + 1 - (1-\color{darkred}y_t ) \cdot (\color{darkblue}z_t^{up}+1)\\  & \color{darkred}z_t \in \{0,\dots,\color{darkblue}z_t^{up} \}\end{align}\] Here \( \color{darkblue}z_t^{up} =6\) is the upper bound on \( \color{darkred}z_t\). This formulation requires a little bit of thinking. Finally, I want to mention that \( \color{darkred}z_t\) can be declared as a continuous variable: it will be integer automatically. 

A simpler alternative formulation is to just forbid any sequence of 7 consecutive 1's for \(\color{darkred}y\). I.e. \[\color{darkred}y_t+\color{darkred}y_{t+1}+\color{darkred}y_{t+2}+\color{darkred}y_{t+3}+\color{darkred}y_{t+4}+\color{darkred}y_{t+5}+\color{darkred}y_{t+6} \le 6\] It may be easier to look backward: \[\color{darkred}y_t+\color{darkred}y_{t-1}+\color{darkred}y_{t-2}+\color{darkred}y_{t-3}+\color{darkred}y_{t-4}+\color{darkred}y_{t-5}+\color{darkred}y_{t-6} \le 6\] The terms \(\color{darkred}y_{t-k}\) can have a non-positive index \(t-k\) which means this is historic data, from before our planning period. In other words \(\color{darkblue}y_0,\color{darkblue}y_{-1},\color{darkblue}y_{-2},\dots\) is data.


This is a good example where we can achieve the same result using two very different approaches. It may make sense to choose the simplest formulation.


  1. Linear Programming - Re-setting a variable based on it's cumulative count,

Friday, November 20, 2020

OR Scientist Job @ Amobee

The Operations Research Scientist is responsible for collaborating with the team to design and implement systems that focus on forecasting, allocating, delivering, and reporting of advertising on the internet, TV, and connected devices.  The successful candidate will utilize advanced mathematical, optimization, machine-learning, software engineering, and programming skills to solve challenging problems.


  • Develop innovative mathematical programming (LP, MILP, …) solutions to run efficiently and at scale
  • Develop innovative forecasts, models, algorithms, and solutions and build prototypes and production implementations
  • Work cross-functionally to gather feedback and requirements from end users
  • Be a key contributor to product innovation and company strategy
  • Communicate technical information clearly to both technical and non-technical people


  • Degree in Operations Research, Mathematics, Statistics, Industrial Engineering, Mechanical Engineering, Physics, or related field;  MS/PhD strongly preferred
  • 3+ years of relevant experience
  • Strong experience with implementing decomposition techniques, column generation preferred.
  • Strong analytical, statistical, and math abilities; Exceptional problem-solving skills
  • Excellent communication skills
  • Extreme attention to detail
  • Ability to work as part of a team in a fast-paced environment
  • Familiarity with GAMS, Gurobi/Cplex, CVX, Python, R, Java, Scala, SQL

Friday, November 6, 2020

Gurobi 9.1 Performance Improvement and New Features.

Some really good performance improvements. Striking are the numbers for non-convex MIQCP models. 

Performance Improvements

With Gurobi 9.1, the Gurobi Optimizer registered notable performance improvements across multiple problem types including:

  • Primal simplex: 17% faster overall, 37% faster on models that take at least 100 seconds.
  • Dual simplex: 29% faster overall, 66% faster on models that take at least 100 seconds.
  • Barrier: 15% faster overall, 34% faster on models that take at least 100 seconds.
  • Mixed-integer linear programming (MILP): 5% faster overall, 9% faster on models that take at least 100 seconds.
  • Convex mixed-integer quadratic programming (MIQP): 5% faster overall, 20% faster on models that take at least 100 seconds.
  • Convex mixed-integer quadratically constrained programming (MIQCP): 13% faster overall, 57% faster on models that take at least 100 seconds.
  • Non-convex mixed-integer quadratically constrained programming (non-convex MIQCP): 4x faster overall, 9x faster on models that take at least 100 seconds.
  • Irreducible Infeasible Subset (IIS) computation: 2.6x faster overall, 5.7x faster on models that take at least 100 seconds.
  • Better MIP feasible solutions: Heuristics are significantly better at finding high-quality solutions earlier.

New Features

The new features in the release include:

    • NoRel Heuristic: This new heuristic finds high-quality solutions in situations where the linear programming (LP) relaxation of the mixed-integer programming (MIP) problem is too expensive to solve.
    • Integrality Focus: This new feature allows users to be much stricter on integrality constraints, thus avoiding many undesirable results (including trickle flows) that can come from small integrality violations.
    • Python Matrix API Enhancements: Gurobi’s Python interface – gurobipy – has been extended and improved to better support matrix-oriented modeling.
    • Pip Install Support: Users can now utilize pip, a Python tool, to install Gurobi in their Python environment.


    Optimal (Q,R) Inventory Policy as a MIP

    In [1] the question was posed: "Can we formulate a (Q,R) inventory model as an integer programming model?" Of course, I said: Yes. But we need to be a bit careful if we want to make it a linear model. Let's check if I was just bluffing here...

    (Q,R) Inventory Policy

    The (Q,R) inventory model is as follows: if the inventory falls below R, place an order for Q. In addition, we deal with:
    1. Lead times: when we placed an order it takes some time to be delivered
    2. When inventory becomes zero, additional demand will cause a backlog. Backlogged demand will be fulfilled when replenishments arrive but at a cost.
    3. Costs are:
      1. Fixed ordering cost
      2. Holding cost related to inventory
      3. Penalties related to backlogs


    We follow [1]  here. Demand is drawn randomly from a discrete uniform distribution between 0 and 99.

    ----     11 PARAMETER demand  random data
    t1   17,    t2   84,    t3   55,    t4   30,    t5   29,    t6   22,    t7   34,    t8   85,    t9    6,    t10  50
    t11  99,    t12  57,    t13  99,    t14  76,    t15  13,    t16  63,    t17  15,    t18  25,    t19  66,    t20  43
    t21  35,    t22  35,    t23  13,    t24  15,    t25  58,    t26  83,    t27  23,    t28  66,    t29  77,    t30  30
    t31  11,    t32  50,    t33  16,    t34  87,    t35  26,    t36  28,    t37  59,    t38  72,    t39  62,    t40  46
    t41  41,    t42  11,    t43  31,    t44   4,    t45  33,    t46  18,    t47  64,    t48  56,    t49  76,    t50  29
    t51  66,    t52  75,    t53  62,    t54  28,    t55   8,    t56  10,    t57  64,    t58  54,    t59   3,    t60  79
    t61   7,    t62  17,    t63  52,    t64  75,    t65  17,    t66   3,    t67  58,    t68  62,    t69  38,    t70  35
    t71  24,    t72  24,    t73  13,    t74  93,    t75  37,    t76  78,    t77  30,    t78  12,    t79  74,    t80   6
    t81  20,    t83  26,    t84  49,    t85  15,    t86  17,    t87  33,    t88  31,    t89  32,    t90  96,    t91  99
    t92  36,    t93  37,    t94  77,    t95  39,    t96  91,    t97  11,    t98  73,    t99   5,    t100 57

    We assume demand occurs during period \(t\), i.e. it is a "flow" quantity (opposed to "stock variables"). Note that demand for time period 82 is zero (this is not printed here).

    The other data for the problem looks like:

    ----     28 PARAMETER invCap               =     1000.000  inventory capacity
                PARAMETER maxQ                 =     1000.000  max order quantity
                PARAMETER orderCost            =      500.000  cost for each order
                PARAMETER holdCost             =        2.000  holding cost (per unit, per time period)
                PARAMETER leadTime             =       10.000  lead time
                PARAMETER backLogCost          =       10.000  backlogged orders penalty (per unit, per time period)
                PARAMETER maxBackLogged        =     1000.000  big-M
    ----     28 PARAMETER initInv   intial inventory available at t1
    t1 500.000


    I need a bunch of decision variables to model this. Let's list them:

    Decision variables
    \(\color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}]\)Reorder quantity. Each time we place an order, this is the size of the order.
    \(\color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}]\)Reorder point. We place an order when the inventory drops below \(\color{darkred}R\).
    \(\color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] \)Inventory at end of period \(t\).
    \(\color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}]\)Backlog, unmet demand we fulfill once new inventory arrives, again at end of period \(t\).
    \(\color{darkred}{\mathit{low}}_t \in \{0,1\}\)Inventory is low, defined by \(\color{darkred}{\mathit{inv}}_t \le \color{darkred}R\).
    \(\color{darkred}{\mathit{order}}_t \in \{0,1\}\)An order is placed. We place an order when \(\color{darkred}{\mathit{inv}}_t\) drops below \(\color{darkred}R\).
    \(\color{darkred}{\mathit{repl}}_t \in \{0,\color{darkred}Q\}\)Replenishment. Inventory is replenished by a previously placed order (there is a lead time). This replenishment is also used to handle backlogged demand. \(\color{darkred}{\mathit{repl}}_t\) is either zero or equal to the reorder quantity \(\color{darkred}Q\).

    In the question, it was required that the decision variables are integer-valued. So, I declared variables as integer where needed.

    Model equations

    1. The objective has three parts: fixed ordering cost, holding cost, and backlog penalities. So we have \[\min \> \color{darkred} z = \color{darkblue}{\mathit{orderCost}}\cdot\sum_t \color{darkred}{\mathit{order}}_t+\color{darkblue}{\mathit{holdCost}}\cdot \sum_t \color{darkred}{\mathit{inv}}_t+\color{darkblue}{\mathit{backlogCost}} \cdot\sum_t \color{darkred}{\mathit{back}}_t\]

    2. The inventory balance equation is a bit complex as we allow backlog: \[\color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t = \color{darkred}{\mathit{inv}}_{t-1}-\color{darkred}{\mathit{back}}_{t-1} - \color{darkblue}{\mathit{demand}}_t + \color{darkred}{\mathit{repl}}_t + \color{darkblue}{\mathit{initInv}}_t \] Instead of just inventory, we update the quantity \(\color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t \). We require that only one of these variables can be non-zero. That means we need to add the complementarity condition \[\color{darkred}{\mathit{inv}}_t\cdot \color{darkred}{\mathit{back}}_t = 0\] In many cases when we minimize cost this condition is not needed, as the model will automatically have not both variables nonzero. Unfortunately, in this case the model may not always do this, just to get some early reordering, so we need to enforce this explicitly. The complementarity condition can be linearized as: \[\begin{align} & \color{darkred}{\mathit{inv}}_t \le  \color{darkblue}{\mathit{invCap}} \cdot \color{darkred}\delta_t \\ & \color{darkred}{\mathit{back}}_t \le  \color{darkblue}{\mathit{maxBackLogged}} \cdot (1-\color{darkred}\delta_t) \\ & \color{darkred}\delta_t \in \{0,1\}\end{align}    \] In GAMS, when indexing outside the domain, a zero is returned. That means \(\color{darkred}{\mathit{inv}}_{0}=\color{darkred}{\mathit{back}}_{0}=0\). To handle initial inventory, we use the term \(\color{darkblue}{\mathit{initInv}}_t\). This is a sparse vector with only a nonzero value in the first position. This trick makes it easy to have a single constraint for all time periods as opposed to handling the first period differently. 

    3. To detect low inventory, i.e. when \(\color{darkred}{\mathit{inv}}_t \le \color{darkred}R\), we add a binary variable \(\color{darkred}{\mathit{low}}_t \in \{0,1\}\). We need to make sure: \[\begin{cases}\color{darkred}{\mathit{inv}}_t \le \color{darkred}R \Rightarrow  \color{darkred}{\mathit{low}}_t  = 1 \\ \color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 \Rightarrow  \color{darkred}{\mathit{low}}_t  = 0 \end{cases}\] I implemented this as  \[\begin{align} &  \color{darkred}{\mathit{inv}}_t \le \color{darkred}R + \color{darkblue}{\mathit{invCap}} \cdot (1-\color{darkred}{\mathit{low}}_t ) \\ &\color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 - (\color{darkblue}{\mathit{invCap}}+1)\cdot \color{darkred}{\mathit{low}}_t\end{align}  \] This implies \(\color{darkred}R \le \color{darkblue}{\mathit{invCap}} \), which is a proper bound. 

    4. A reorder event takes place when inventory drops below \(\color{darkred}R\). This means we should trigger a reorder only when \(\color{darkred}{\mathit{low}}_{t-1}=0\) and \(\color{darkred}{\mathit{low}}_{t}=1\). In terms of our model, this condition can be written as \[\color{darkred}{\mathit{order}}_t = (1-\color{darkred}{\mathit{low}}_{t-1}) \cdot \color{darkred}{\mathit{low}}_t\] I implemented a linearization of this as: \[\begin{align} & \color{darkred}{\mathit{order}}_t  \le  1-\color{darkred}{\mathit{low}}_{t-1} \\ &   \color{darkred}{\mathit{order}}_t  \le  \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t  \ge \color{darkred}{\mathit{low}}_t -\color{darkred}{\mathit{low}}_{t-1} \end{align}\] Note that this rule implies that after replenishment the inventory must exceed \(\color{darkred}R\) (if not, we cannot reorder). 

    5. Inventory replenishment takes place after an order was placed and when the lead time passed. We can write this as \[\color{darkred}{\mathit{repl}}_t  = \color{darkred}Q  \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}\] with \(\color{darkred}{\mathit{repl}}_t \in [0,\color{darkblue}{\mathit{maxQ}}]\). I linearized this as: \[ \begin{align} &  \color{darkred}{\mathit{repl}}_t \le \color{darkblue}{\mathit{maxQ}} \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}} \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkred}Q  \\ & \color{darkred}{\mathit{repl}}_t \ge  \color{darkred}Q - \color{darkblue}{\mathit{maxQ}}\cdot (1-\color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}) \end{align}\] For simplicity, I will assume that no ordering happened just before \(t=1\). In other words, all \(\color{darkred}{\mathit{order}}_t=0\) when \(t \le 0\). That will make a model a little bit messier, but you will need to address this in a real, practical model.

    OK, this was a lot of things to digest. Let's summarize the model:

    MIP Model
    \[\begin{align} \min \> & \color{darkred} z = \color{darkblue}{\mathit{orderCost}}\cdot\sum_t \color{darkred}{\mathit{order}}_t+\color{darkblue}{\mathit{holdCost}}\cdot \sum_t \color{darkred}{\mathit{inv}}_t+\color{darkblue}{\mathit{backlogCost}} \cdot\sum_t \color{darkred}{\mathit{back}}_t && (1) \\ & \color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t = \color{darkred}{\mathit{inv}}_{t-1}-\color{darkred}{\mathit{back}}_{t-1} - \color{darkblue}{\mathit{demand}}_t + \color{darkred}{\mathit{repl}}_t + \color{darkblue}{\mathit{initInv}}_t && (2) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkblue}{\mathit{invCap}} \cdot \color{darkred}\delta_t \\ & \color{darkred}{\mathit{back}}_t \le \color{darkblue}{\mathit{maxBackLogged}} \cdot (1-\color{darkred}\delta_t) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkred}R + \color{darkblue}{\mathit{invCap}} \cdot (1-\color{darkred}{\mathit{low}}_t ) && (3) \\ &\color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 - (\color{darkblue}{\mathit{invCap}}+1)\cdot \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \le 1-\color{darkred}{\mathit{low}}_{t-1} && (4) \\ &\color{darkred}{\mathit{order}}_t \le \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \ge \color{darkred}{\mathit{low}}_t -\color{darkred}{\mathit{low}}_{t-1} \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkblue}{\mathit{maxQ}} \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}} && (5) \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkred}Q \\ & \color{darkred}{\mathit{repl}}_t \ge  \color{darkred}Q - \color{darkblue}{\mathit{maxQ}}\cdot (1-\color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}) \\ & \color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}] \\ & \color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}] \\ & \color{darkred}\delta_t \in \{0,1\} \\ & \color{darkred}{\mathit{low}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{order}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{repl}}_t \in [0,\color{darkblue}{\mathit{maxQ}}] \end{align}\]

    In my experiment, all decision variables were binary or integer variables. This is optional: we can use continuous variables for \(\color{darkred}Q\) and \(\color{darkred}R\) and related variables, if this is more appropriate.


    Detailed results can look like:

    ----    163 PARAMETER result  
                     demand         inv     backlog       below     reorder   replenish
    t1                   17         483
    t2                   84         399
    t3                   55         344                       1           1
    t4                   30         314                       1
    t5                   29         285                       1
    t6                   22         263                       1
    t7                   34         229                       1
    t8                   85         144                       1
    t9                    6         138                       1
    t10                  50          88                       1
    t11                  99                      11           1
    t12                  57                      68           1
    t13                  99         345                                             512
    t14                  76         269                       1           1
    t15                  13         256                       1
    t16                  63         193                       1
    t17                  15         178                       1
    t18                  25         153                       1
    t19                  66          87                       1
    t20                  43          44                       1
    t21                  35           9                       1
    t22                  35                      26           1
    t23                  13                      39           1
    t24                  15         458                                             512
    t25                  58         400
    t26                  83         317                       1           1
    t27                  23         294                       1
    t28                  66         228                       1
    t29                  77         151                       1
    t30                  30         121                       1
    t31                  11         110                       1
    t32                  50          60                       1
    t33                  16          44                       1
    t34                  87                      43           1
    t35                  26                      69           1
    t36                  28         415                                             512
    t37                  59         356
    t38                  72         284                       1           1
    t39                  62         222                       1
    t40                  46         176                       1
    t41                  41         135                       1
    t42                  11         124                       1
    t43                  31          93                       1
    t44                   4          89                       1
    t45                  33          56                       1
    t46                  18          38                       1
    t47                  64                      26           1
    t48                  56         430                                             512
    t49                  76         354
    t50                  29         325                       1           1
    t51                  66         259                       1
    t52                  75         184                       1
    t53                  62         122                       1
    t54                  28          94                       1
    t55                   8          86                       1
    t56                  10          76                       1
    t57                  64          12                       1
    t58                  54                      42           1
    t59                   3                      45           1
    t60                  79         388                                             512
    t61                   7         381
    t62                  17         364
    t63                  52         312                       1           1
    t64                  75         237                       1
    t65                  17         220                       1
    t66                   3         217                       1
    t67                  58         159                       1
    t68                  62          97                       1
    t69                  38          59                       1
    t70                  35          24                       1
    t71                  24                                   1
    t72                  24                      24           1
    t73                  13         475                                             512
    t74                  93         382
    t75                  37         345
    t76                  78         267                       1           1
    t77                  30         237                       1
    t78                  12         225                       1
    t79                  74         151                       1
    t80                   6         145                       1
    t81                  20         125                       1
    t82                             125                       1
    t83                  26          99                       1
    t84                  49          50                       1
    t85                  15          35                       1
    t86                  17         530                                             512
    t87                  33         497
    t88                  31         466
    t89                  32         434
    t90                  96         338                       1           1
    t91                  99         239                       1
    t92                  36         203                       1
    t93                  37         166                       1
    t94                  77          89                       1
    t95                  39          50                       1
    t96                  91                      41           1
    t97                  11                      52           1
    t98                  73                     125           1
    t99                   5                     130           1
    t100                 57         325                       1                     512
    Q                   512
    R                   344
    numorder              8
    holdcost          37580
    ordercost          4000
    backlogcost        7410

    A picture of this:

    This looks reasonable to me. 

    Note that in our model the reorder point \(\color{darkred}R\) (straight light-blue line) can not be higher than the inventory after replenishment (see point 4 above). The first replenishment is exactly demonstrating this bound. Of course, we can refine the model to relax this requirement. 


    With some effort, we can form a linear MIP model that determines the optimal values for \((\color{darkred}Q, \color{darkred}R)\) for any (simulated) demand pattern. We needed a few modeling tricks to formulate this problem as a proper linear MIP model, however. This requires a number of big-M constraints. Using indicator constraints and/or quadratic terms (available in some advanced solvers) can make the model simpler. Our model does not make assumptions about special facilities available in the MIP solver, so this model should work with any (good) MIP solver.


    Tuesday, October 27, 2020

    Finding a cluster of closest points

    Problem statement

    From [1]:

    Consider \(n=100\) points in an 12 dimensional space. Find \(m=8\) points such that they are as close as possible.


    The problem can be stated as a simple MIQP (Mixed Integer Quadratic Programming) model.

    \[\begin{align} \min & \sum_{i\lt j} \color{darkred}x_i \cdot \color{darkred}x_j \cdot \color{darkblue}{\mathit{dist}}_{i,j} \\ & \sum_i \color{darkred}x_i = \color{darkblue}m \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]

    This is a simple model. The main wrinkle is that we want to make sure that we only count each distance once. For this reason, we only consider distances with \(i \lt j\). Of course, we can exploit this also in calculating the distance matrix \( \color{darkblue}{\mathit{dist}}_{i,j}\), and make this a strictly upper-triangular matrix. This saves time and memory.

    As the \( \color{darkred}x\) variables are binary, we can easily linearize this problem:

    MINSUM MIP Model
    \[\begin{align} \min & \sum_{i\lt j} \color{darkred}y_{i,j} \cdot \color{darkblue}{\mathit{dist}}_{i,j} \\ & \color{darkred}y_{i,j} \ge \color{darkred}x_i + \color{darkred}x_j - 1 && \forall i \lt j\\ & \sum_i \color{darkred}x_i = \color{darkblue}m \\ & \color{darkred}x_i \in \{0,1\} \\ &\color{darkred}y_{i,j} \in [0,1] \end{align}\]

    The inequality implements the implication \[\color{darkred}x_i= 1 \textbf{ and } \color{darkred}x_j = 1 \Rightarrow \color{darkred}y_{i,j} = 1 \] The variables \(\color{darkred}y_{i,j}\) can be binary or can be relaxed to be continuous between 0 and 1. The MIQP and MIP model should deliver the same results. 

    Finally, we can also consider a slightly different problem. Instead of minimizing the sum of the distances between points inside the cluster of the selected points, we can also minimize the maximum distance between points inside the cluster. The (linear) model can look like:

    MINMAX MIP Model
    \[\begin{align} \min\> & \color{darkred}z \\ & \color{darkred}z \ge \color{darkred}y_{i,j} \cdot \color{darkblue}{\mathit{dist}}_{i,j} && \forall i \lt j \\ & \color{darkred}y_{i,j} \ge \color{darkred}x_i + \color{darkred}x_j - 1 && \forall i \lt j\\ & \sum_i \color{darkred}x_i = \color{darkblue}m \\ & \color{darkred}x_i \in \{0,1\} \\ &\color{darkred}y_{i,j} \in [0,1] \end{align}\]

    We re-use our linearization here.

    Finally, I also tried a very simple greedy heuristic:
    1. Select the two points that are closest to each other.
    2. Select a new unselected point that is closest to our already selected points.
    3. Repeat step 2 until we have selected 8 points.
    We can expect our optimization models to do a bit better than this simplistic approach. 

    In our methods here, we did not make any assumptions about the distance metric being used We just needed the distance matrix. This means you can use Euclidean, Manhattan, or other metrics. In addition, you can use some normalization or weighting before calculating the distances. This may be useful if the features have different units. These models also do not change whether one uses low- or high-dimensional data.

    There are alternative models one could envision, such as finding the smallest enclosing sphere or box. These methods do not make things simpler and will likely not improve upon our models. 

    Small data set

    Let's start with selecting \(m=8\) points from  \(n=50\), using random 2d coordinates.

    ----     10 PARAMETER coord  coordinates
                      x           y
    point1        0.172       0.843
    point2        0.550       0.301
    point3        0.292       0.224
    point4        0.350       0.856
    point5        0.067       0.500
    point6        0.998       0.579
    point7        0.991       0.762
    point8        0.131       0.640
    point9        0.160       0.250
    point10       0.669       0.435
    point11       0.360       0.351
    point12       0.131       0.150
    point13       0.589       0.831
    point14       0.231       0.666
    point15       0.776       0.304
    point16       0.110       0.502
    point17       0.160       0.872
    point18       0.265       0.286
    point19       0.594       0.723
    point20       0.628       0.464
    point21       0.413       0.118
    point22       0.314       0.047
    point23       0.339       0.182
    point24       0.646       0.561
    point25       0.770       0.298
    point26       0.661       0.756
    point27       0.627       0.284
    point28       0.086       0.103
    point29       0.641       0.545
    point30       0.032       0.792
    point31       0.073       0.176
    point32       0.526       0.750
    point33       0.178       0.034
    point34       0.585       0.621
    point35       0.389       0.359
    point36       0.243       0.246
    point37       0.131       0.933
    point38       0.380       0.783
    point39       0.300       0.125
    point40       0.749       0.069
    point41       0.202       0.005
    point42       0.270       0.500
    point43       0.151       0.174
    point44       0.331       0.317
    point45       0.322       0.964
    point46       0.994       0.370
    point47       0.373       0.772
    point48       0.397       0.913
    point49       0.120       0.735
    point50       0.055       0.576

    The MINSUM MIQP and MIP model gives the same solution, but (as expected) the MINMAX model is slightly different. Here are the results with Cplex:

              HEURISTIC        MIQP         MIP      MINMAX
    point2        1.000
    point3                    1.000       1.000       1.000
    point9                                            1.000
    point10       1.000
    point11                   1.000       1.000
    point12                                           1.000
    point15       1.000
    point18                   1.000       1.000       1.000
    point20       1.000
    point23                   1.000       1.000       1.000
    point24       1.000
    point25       1.000
    point27       1.000
    point29       1.000
    point35                   1.000       1.000
    point36                   1.000       1.000       1.000
    point39                   1.000       1.000       1.000
    point43                                           1.000
    point44                   1.000       1.000
    status                  Optimal     Optimal     Optimal
    obj                       3.447       3.447       0.210
    sum           4.997       3.447       3.447       3.522
    max           0.291       0.250       0.250       0.210
    time                     33.937       2.125       0.562

    When we plot the results, we see:

    Obviously, our optimization models do much better than the heuristic. Two optimization models have quite an overlap in the selected points. To compare the objectives MINSUM and MINMAX we can plot them against each other:

    Obviously, our heuristic is not doing that great.

    Large data set 

    Here we select \(m=8\) points from  \(n=100\), using random 12d coordinates. Using Gurobi on a faster machine we see:

    ----    112 PARAMETER results  
              HEURISTIC        MIQP         MIP      MINMAX
    point5        1.000                               1.000
    point8                    1.000       1.000
    point17       1.000                               1.000
    point19                   1.000       1.000
    point24                                           1.000
    point35                   1.000       1.000
    point38                   1.000       1.000
    point39                                           1.000
    point42       1.000                               1.000
    point43                                           1.000
    point45       1.000       1.000       1.000
    point51       1.000
    point56       1.000
    point76       1.000
    point81                   1.000       1.000
    point89                   1.000       1.000
    point94       1.000       1.000       1.000       1.000
    point97                                           1.000
    status                TimeLimit     Optimal     Optimal
    obj                      26.230      26.230       1.147
    sum          30.731      26.230      26.230      27.621
    max           1.586       1.357       1.357       1.147
    time                   1015.984      49.000       9.594
    gap%                     23.379

    The MINSUM MIQP model hit a time limit of 1000 seconds but actually found the optimal solution. It just did not have the time to prove it. The overlap between the MINSUM models and the MINMAX model is very small (just one shared point). I expected more overlap.


    The MINMAX model seems to solve faster than the MINSUM versions. This is a little bit surprising: often these max or min objectives are slower than direct sums. It always pays off to do some experiments with different formulations. Slightly different models can show a large difference in solution times. This is a great example of that.

    These models are independent of the dimensionality of the data and the distance metric being used.


    1. Given a set of points or vectors, find the set of N points that are closest to each other,

    Appendix: GAMS code

    Some interesting features:
    • There are not that many models where I can use xor operators. Here it is used in the Greedy Heuristic where we want to consider points \(i\) and \(j\) where one of them is in the cluster and one outside. 
    • Of course instead of the condition that x.l(i) xor x.l(j) is nonzero, we also could have required that x.l(i)+x.l(j)=1. I just could not resist the xor.
    • Macros are used to prevent repetitive code in the reporting of results.
    • Acronyms are used in reporting. 


    find cluster of m=8 points closest to each other

    simple greedy heuristic (HEURISTIC)
    quadratic model (MIQP)
    linearized model (MIP)
    minmax model (MINMAX)


    'coordinates' /x,y/
    'points' /point1*point50/

    scalar m 'number of points to select' /8/;

    parameter coord(i,c) 'random coordinates';
    coord(i,c) = uniform(0,1);
    display coord;

    set ij(i,j) 'upper triangular structure';
    ij(i,j) =
    ord(i) < ord(j);

    parameter dist(i,j) 'euclidean distances';
    dist(ij(i,j)) = sqrt(
    sum(c, sqr(coord(i,c)-coord(j,c))));

    binary variables x(i) 'selected points';
    positive variable y(i,j) 'x(i)*x(j), relaxed to be continuous';

    variable z 'objective';

    'quadratic objective'
    'linearized objective'
    'minmax objective'
    'number of selected points'
    'implication x(i)=x(j)=1 ==> y(i,j)=1'

    obj1.. z =e=
    sum(ij(i,j), x(i)*x(j)*dist(ij));
    obj2.. z =e=
    sum(ij, y(ij)*dist(ij));
    obj3(ij).. z =g= y(ij)*dist(ij);
    sum(i, x(i)) =e= m;
    bound(ij(i,j))..  y(i,j) =g= x(i) + x(j) - 1;

    model m1 /obj1,select/;
    model m2 /obj2,select,bound/;
    model m3 /obj3,select,bound/;
    option optcr=0,mip=cplex,miqcp=cplex,threads=8;

    * reporting macros

    parameter results(*,*);

    set dummy 'ordering for output' /  status, obj, sum, max, time, 'gap%' /;

    acronym TimeLimit;
    acronym Optimal;

    * macros for reporting
    $macro sumdist    sum(ij(i,j), x.l(i)*x.l(j)*dist(ij))
    $macro maxdist    smax(ij(i,j), x.l(i)*x.l(j)*dist(ij))
    $macro report(m,label)  \
        x.l(i) = round(x.l(i));  \
        results(i,label) = x.l(i); \
    'status',label)$(m.solvestat=1) = Optimal; \
    'status',label)$(m.solvestat=3) = TimeLimit; \
    'obj',label) = z.l; \
    'sum',label) = sumdist; \
    'max',label) = maxdist; \
    'time',label) = m.resusd; \
    'gap%',label)$(m.solvestat=3) = 100*abs(m.objest - m.objval)/abs(m.objest);

    * heuristic

    * step 1 : select points 1 and 2 by minimizing distance
    scalar dmin,dmin1,dmin2;
    dmin =

    * add points 3..m, closest to points selected earlier
    scalar k;
    for(k = 3 to m,
       dmin =
    smin(ij(i,j)$(x.l(i) xor x.l(j)), dist(ij));

    'HEURISTIC') = x.l(i);
    'sum','HEURISTIC') = sumdist;
    'max','HEURISTIC') = maxdist;

    * run models m1 through m3

    solve m1 minimizing z using miqcp;

    solve m2 minimizing z using mip;

    solve m3 minimizing z using mip;

    display results;