Yet Another Math Programming Consultant: November 2020

Monday, November 30, 2020

QuadProg: usage issues

The QuadProg QP (Quadratic Programming) solver, based on an algorithm by Goldfarb and Idnani [1,2], is quite popular. It is available for R [3] and Python [4]. However, it has some real drawbacks. Some of it is just "user interface" and documentation: how to call the solver.

The mathematical model the solver works with is [3]: \[\begin{align} \min\>& -\color{darkblue}d^T\color{darkred}b + \frac{1}{2}\color{darkred}b^T\color{darkblue}D\color{darkred}b \\ & \color{darkblue}A^T\color{darkred}b \begin{bmatrix}= \\ \ge \end{bmatrix} \color{darkblue}b_0 \end{align} \]

This immediately poses some questions:

Variables are typically called $\color{darkred}x$, and $\color{darkblue}b$ is usually a constant. Why the different notation here?
I don't know. Maybe to make sure you pay attention.
How to handle bounds?
The variable type the solver supports is called "free" (i.e. unrestricted). This is different than for many LP solvers, where the default variable type is positive. For non-negative variables, you need to explicitly add the constraints $\color{darkred}b \ge 0$. Similar for other bounds.
Why the different signs for the linear part and the quadratic part in the objective?
I don't know.
Why the constant $1/2$ in the objective?
We see this more often in QP solvers. This is because the gradient of the quadratic term $\color{darkred}x^T\color{darkblue}Q\color{darkred}x$ is $2\color{darkblue}Q\color{darkred}x$. The algorithm prefers to work with $\color{darkblue}Q\color{darkred}x$ however. So in order to stay really close to the algorithm, the user is asked to multiply all quadratic coefficients $\color{darkblue}D_{i,j}$ by 2.
Most QP solvers call the matrix with quadratic coefficients $\color{darkblue}Q$. Why is it called $\color{darkblue}D$ here?
Don't know.
Does $\color{darkblue}D$ have to be symmetric?
The documentation does not tell, but the answer is: only the diagonal and upper-triangular part of $\color{darkblue}D$ are used. This means the matrix is assumed to be symmetric. E.g. when passing \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 100 & 1 \end{pmatrix}\] the solver thinks it received \[\color{darkblue}D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\]
Can $\color{darkblue}D$ be positive semi-definite?
In QuadProg, $\color{darkblue}D$ has to be strictly positive definite.
Defnition: A matrix $Q$ is called positive semi-definite when $x^TQx\ge 0$ for all $x$. It is called positive definite when $x^TQx\gt 0$ for all $x \ne 0$. A positive definite matrix has strictly positive eigenvalues. I.e., the smallest eigenvalue of $\color{darkblue}D$ must be $\lambda_{min}\gt 0$. For a positive semi-definite matrix all eigenvalues are non-negative (i.e., zero is allowed).
The fact that QuadProg requires a strict positive definite matrix, means that all variables need to be quadratic and also that this algorithm can not solve LPs. Most QP solvers (in fact all the ones I know beyond QuadProg) only require positive semi-definite quadratic coefficient matrices. I.e., for linear variables we can use coefficients $\color{darkblue}q_{i,j}=0$ (where $i$ or $j$ corresponds to a linear variable). Quadprog is stricter. The whole matrix $\color{darkblue}D$ most be strictly positive definite. You should be aware of these restrictions. And the documentation [3] should have mentioned this.
Can I trick QuadProg to accept linear variables?
Yes, suppose we have 2 quadratic variables and 2 linear variables. Then form the $\color{darkblue}D$ matrix as \[\color{darkblue}D = \begin{pmatrix} \color{darkblue}d_{1,1} &\color{darkblue}d_{1,2} & 0 & 0 \\ \color{darkblue}d_{2,1} &\color{darkblue}d_{2,2} & 0 & 0 \\ 0 & 0 &\color{darkblue} \varepsilon & 0 \\ 0 & 0 & 0 &\color{darkblue} \varepsilon \end{pmatrix}\] where $\color{darkblue} \varepsilon\gt 0$ is a small constant. This trick would also allow us to feed the algorithm linear problems.
Usually, solvers express linear constraints as $\color{darkblue}A\color{darkred}x=\color{darkblue}b$. Why do we have a transposed matrix $\color{darkblue}A^T$ here?
I suspect this is closer to what the dual algorithm prefers. Again, making the input closer to what the solver wants.
How to input $\le$ constraints?
Multiply these by $-1$ and make them $\ge$ constraints.

These are some of the questions I have seen in the past about QuadProg. Hopefully, this at least demystifies some of them.

As an alternative to QuadProg, you may want to have a look at CVXR or CVXPY for modeling QPs. These tools can talk to different solvers and will apply the necessary transformations to make the model suited for the selected solver. The modeling is also a bit more natural than the matrix input for QuadProg. For non-trivial quadratic programming models, QuadProg may not be the best choice.

References

D. Goldfarb and A. Idnani (1982). Dual and Primal-Dual Methods for Solving Strictly Convex Quadratic Programs. In J. P. Hennart (ed.), Numerical Analysis, Springer-Verlag, Berlin, pages 226–239.
D. Goldfarb and A. Idnani (1983). A numerically stable dual method for solving strictly convex quadratic programs. Mathematical Programming, 27, 1–33.
R package quadprog, https://cran.r-project.org/web/packages/quadprog/quadprog.pdf
Python package quadprog, https://pypi.org/project/quadprog/

Monday, November 23, 2020

Machine scheduling: Maintenance and maximum runlength

In [1] a problem is posed that deals with a maximum run-length (6 days) before the machine needs to be taken off-line for maintenance (1 day). Consider the binary decision variable:

\[\color{darkred}y_t = \begin{cases} 1 & \text{if machine is online at period $t$} \\ 0 & \text{otherwise} \end{cases} \]

In [1] a scheme is proposed as follows. Introduce an integer variable $\color{darkred}z_t$ that has the cumulative days the machine is running. Then apply the rule:

if $\color{darkred}z\gt 6$ then $\color{darkred}y:=0,\color{darkred}z:=0$

This approach needs a bit of repair to make it work for a MIP model. A MIP model can not contain assignments, and decision variables are part of a system of linear equations. I.e., all decision variables are essentially calculated at the same time (a notion which can be foreign for computer programmers who think in sequential assignment statements). We can write:\[\begin{align} & \color{darkred}y_t = 1 \Rightarrow \color{darkred}z_t = \color{darkred}z_{t-1} + 1 \\ & \color{darkred}y_t = 0 \Rightarrow \color{darkred}z_t = 0 \\ & \color{darkred}z_t \in \{0,\dots,6\}\end{align}\] This is in the form of implications that can be used directly with (advanced) solvers that support indicator constraints (i.e., a binary variable on the left and a linear constraint on the right). Note that we don't do anything special when $\color{darkred}z_t$ becomes 7. We just forbid that from happening by setting an upper bound of 6. The initial value $\color{darkblue}z_0$ is the number of periods the machine is already online at the beginning of the planning period.

If your solver or modeling tool does not have support for indicator constraints, we can rewrite things as: \[\begin{align}& \color{darkblue}z_t^{up} := 6 \\ & \color{darkred}z_t \le \color{darkred} y_t \cdot \color{darkblue}z_t^{up}\\ & \color{darkred}z_t \le \color{darkred}z_{t-1} + 1 \\ & \color{darkred}z_t \ge \color{darkred}z_{t-1} + 1 - (1-\color{darkred}y_t ) \cdot (\color{darkblue}z_t^{up}+1)\\ & \color{darkred}z_t \in \{0,\dots,\color{darkblue}z_t^{up} \}\end{align}\] Here $ \color{darkblue}z_t^{up} =6$ is the upper bound on $ \color{darkred}z_t$. This formulation requires a little bit of thinking. Finally, I want to mention that $ \color{darkred}z_t$ can be declared as a continuous variable: it will be integer automatically.

A simpler alternative formulation is to just forbid any sequence of 7 consecutive 1's for $\color{darkred}y$. I.e. \[\color{darkred}y_t+\color{darkred}y_{t+1}+\color{darkred}y_{t+2}+\color{darkred}y_{t+3}+\color{darkred}y_{t+4}+\color{darkred}y_{t+5}+\color{darkred}y_{t+6} \le 6\] It may be easier to look backward: \[\color{darkred}y_t+\color{darkred}y_{t-1}+\color{darkred}y_{t-2}+\color{darkred}y_{t-3}+\color{darkred}y_{t-4}+\color{darkred}y_{t-5}+\color{darkred}y_{t-6} \le 6\] The terms $\color{darkred}y_{t-k}$ can have a non-positive index $t-k$ which means this is historic data, from before our planning period. In other words $\color{darkblue}y_0,\color{darkblue}y_{-1},\color{darkblue}y_{-2},\dots$ is data.

Conclusion

This is a good example where we can achieve the same result using two very different approaches. It may make sense to choose the simplest formulation.

References

Linear Programming - Re-setting a variable based on it's cumulative count, https://stackoverflow.com/questions/64930027/linear-programming-re-setting-a-variable-based-on-its-cumulative-count

Friday, November 20, 2020

OR Scientist Job @ Amobee

The Operations Research Scientist is responsible for collaborating with the team to design and implement systems that focus on forecasting, allocating, delivering, and reporting of advertising on the internet, TV, and connected devices. The successful candidate will utilize advanced mathematical, optimization, machine-learning, software engineering, and programming skills to solve challenging problems.

Responsibilities

Develop innovative mathematical programming (LP, MILP, …) solutions to run efficiently and at scale
Develop innovative forecasts, models, algorithms, and solutions and build prototypes and production implementations
Work cross-functionally to gather feedback and requirements from end users
Be a key contributor to product innovation and company strategy
Communicate technical information clearly to both technical and non-technical people

Qualifications:

Degree in Operations Research, Mathematics, Statistics, Industrial Engineering, Mechanical Engineering, Physics, or related field; MS/PhD strongly preferred
3+ years of relevant experience
Strong experience with implementing decomposition techniques, column generation preferred.
Strong analytical, statistical, and math abilities; Exceptional problem-solving skills
Excellent communication skills
Extreme attention to detail
Ability to work as part of a team in a fast-paced environment
Familiarity with GAMS, Gurobi/Cplex, CVX, Python, R, Java, Scala, SQL

More information: https://www.amobee.com/company/careers/senior-operation-research-scientist/

Friday, November 6, 2020

Gurobi 9.1 Performance Improvement and New Features.

Some really good performance improvements. Striking are the numbers for non-convex MIQCP models.

Performance Improvements

With Gurobi 9.1, the Gurobi Optimizer registered notable performance improvements across multiple problem types including:

Primal simplex: 17% faster overall, 37% faster on models that take at least 100 seconds.
Dual simplex: 29% faster overall, 66% faster on models that take at least 100 seconds.
Barrier: 15% faster overall, 34% faster on models that take at least 100 seconds.
Mixed-integer linear programming (MILP): 5% faster overall, 9% faster on models that take at least 100 seconds.
Convex mixed-integer quadratic programming (MIQP): 5% faster overall, 20% faster on models that take at least 100 seconds.
Convex mixed-integer quadratically constrained programming (MIQCP): 13% faster overall, 57% faster on models that take at least 100 seconds.
Non-convex mixed-integer quadratically constrained programming (non-convex MIQCP): 4x faster overall, 9x faster on models that take at least 100 seconds.
Irreducible Infeasible Subset (IIS) computation: 2.6x faster overall, 5.7x faster on models that take at least 100 seconds.
Better MIP feasible solutions: Heuristics are significantly better at finding high-quality solutions earlier.

New Features

The new features in the release include:

NoRel Heuristic: This new heuristic finds high-quality solutions in situations where the linear programming (LP) relaxation of the mixed-integer programming (MIP) problem is too expensive to solve.
Integrality Focus: This new feature allows users to be much stricter on integrality constraints, thus avoiding many undesirable results (including trickle flows) that can come from small integrality violations.
Python Matrix API Enhancements: Gurobi’s Python interface – gurobipy – has been extended and improved to better support matrix-oriented modeling.
Pip Install Support: Users can now utilize pip, a Python tool, to install Gurobi in their Python environment.

Optimal (Q,R) Inventory Policy as a MIP

In [1] the question was posed: "Can we formulate a (Q,R) inventory model as an integer programming model?" Of course, I said: Yes. But we need to be a bit careful if we want to make it a linear model. Let's check if I was just bluffing here...

(Q,R) Inventory Policy

The (Q,R) inventory model is as follows: if the inventory falls below R, place an order for Q. In addition, we deal with:

Lead times: when we placed an order it takes some time to be delivered
When inventory becomes zero, additional demand will cause a backlog. Backlogged demand will be fulfilled when replenishments arrive but at a cost.
Costs are:

Fixed ordering cost
Holding cost related to inventory
Penalties related to backlogs

Data

We follow [1] here. Demand is drawn randomly from a discrete uniform distribution between 0 and 99.

----     11 PARAMETER demand  random data

t1   17,    t2   84,    t3   55,    t4   30,    t5   29,    t6   22,    t7   34,    t8   85,    t9    6,    t10  50
t11  99,    t12  57,    t13  99,    t14  76,    t15  13,    t16  63,    t17  15,    t18  25,    t19  66,    t20  43
t21  35,    t22  35,    t23  13,    t24  15,    t25  58,    t26  83,    t27  23,    t28  66,    t29  77,    t30  30
t31  11,    t32  50,    t33  16,    t34  87,    t35  26,    t36  28,    t37  59,    t38  72,    t39  62,    t40  46
t41  41,    t42  11,    t43  31,    t44   4,    t45  33,    t46  18,    t47  64,    t48  56,    t49  76,    t50  29
t51  66,    t52  75,    t53  62,    t54  28,    t55   8,    t56  10,    t57  64,    t58  54,    t59   3,    t60  79
t61   7,    t62  17,    t63  52,    t64  75,    t65  17,    t66   3,    t67  58,    t68  62,    t69  38,    t70  35
t71  24,    t72  24,    t73  13,    t74  93,    t75  37,    t76  78,    t77  30,    t78  12,    t79  74,    t80   6
t81  20,    t83  26,    t84  49,    t85  15,    t86  17,    t87  33,    t88  31,    t89  32,    t90  96,    t91  99
t92  36,    t93  37,    t94  77,    t95  39,    t96  91,    t97  11,    t98  73,    t99   5,    t100 57

We assume demand occurs during period $t$, i.e. it is a "flow" quantity (opposed to "stock variables"). Note that demand for time period 82 is zero (this is not printed here).

The other data for the problem looks like:

----     28 PARAMETER invCap               =     1000.000  inventory capacity
            PARAMETER maxQ                 =     1000.000  max order quantity
            PARAMETER orderCost            =      500.000  cost for each order
            PARAMETER holdCost             =        2.000  holding cost (per unit, per time period)
            PARAMETER leadTime             =       10.000  lead time
            PARAMETER backLogCost          =       10.000  backlogged orders penalty (per unit, per time period)
            PARAMETER maxBackLogged        =     1000.000  big-M

----     28 PARAMETER initInv   intial inventory available at t1

t1 500.000

Variables

I need a bunch of decision variables to model this. Let's list them:

Decision variables
$\color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}]$	Reorder quantity. Each time we place an order, this is the size of the order.
$\color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}]$	Reorder point. We place an order when the inventory drops below $\color{darkred}R$.
$\color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] $	Inventory at end of period $t$.
$\color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}]$	Backlog, unmet demand we fulfill once new inventory arrives, again at end of period $t$.
$\color{darkred}{\mathit{low}}_t \in \{0,1\}$	Inventory is low, defined by $\color{darkred}{\mathit{inv}}_t \le \color{darkred}R$.
$\color{darkred}{\mathit{order}}_t \in \{0,1\}$	An order is placed. We place an order when $\color{darkred}{\mathit{inv}}_t$ drops below $\color{darkred}R$.
$\color{darkred}{\mathit{repl}}_t \in \{0,\color{darkred}Q\}$	Replenishment. Inventory is replenished by a previously placed order (there is a lead time). This replenishment is also used to handle backlogged demand. $\color{darkred}{\mathit{repl}}_t$ is either zero or equal to the reorder quantity $\color{darkred}Q$.

In the question, it was required that the decision variables are integer-valued. So, I declared variables as integer where needed.

Model equations

1. The objective has three parts: fixed ordering cost, holding cost, and backlog penalities. So we have \[\min \> \color{darkred} z = \color{darkblue}{\mathit{orderCost}}\cdot\sum_t \color{darkred}{\mathit{order}}_t+\color{darkblue}{\mathit{holdCost}}\cdot \sum_t \color{darkred}{\mathit{inv}}_t+\color{darkblue}{\mathit{backlogCost}} \cdot\sum_t \color{darkred}{\mathit{back}}_t\]

2. The inventory balance equation is a bit complex as we allow backlog: \[\color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t = \color{darkred}{\mathit{inv}}_{t-1}-\color{darkred}{\mathit{back}}_{t-1} - \color{darkblue}{\mathit{demand}}_t + \color{darkred}{\mathit{repl}}_t + \color{darkblue}{\mathit{initInv}}_t \] Instead of just inventory, we update the quantity $\color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t $. We require that only one of these variables can be non-zero. That means we need to add the complementarity condition \[\color{darkred}{\mathit{inv}}_t\cdot \color{darkred}{\mathit{back}}_t = 0\] In many cases when we minimize cost this condition is not needed, as the model will automatically have not both variables nonzero. Unfortunately, in this case the model may not always do this, just to get some early reordering, so we need to enforce this explicitly. The complementarity condition can be linearized as: \[\begin{align} & \color{darkred}{\mathit{inv}}_t \le \color{darkblue}{\mathit{invCap}} \cdot \color{darkred}\delta_t \\ & \color{darkred}{\mathit{back}}_t \le \color{darkblue}{\mathit{maxBackLogged}} \cdot (1-\color{darkred}\delta_t) \\ & \color{darkred}\delta_t \in \{0,1\}\end{align} \] In GAMS, when indexing outside the domain, a zero is returned. That means $\color{darkred}{\mathit{inv}}_{0}=\color{darkred}{\mathit{back}}_{0}=0$. To handle initial inventory, we use the term $\color{darkblue}{\mathit{initInv}}_t$. This is a sparse vector with only a nonzero value in the first position. This trick makes it easy to have a single constraint for all time periods as opposed to handling the first period differently.

3. To detect low inventory, i.e. when $\color{darkred}{\mathit{inv}}_t \le \color{darkred}R$, we add a binary variable $\color{darkred}{\mathit{low}}_t \in \{0,1\}$. We need to make sure: \[\begin{cases}\color{darkred}{\mathit{inv}}_t \le \color{darkred}R \Rightarrow \color{darkred}{\mathit{low}}_t = 1 \\ \color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 \Rightarrow \color{darkred}{\mathit{low}}_t = 0 \end{cases}\] I implemented this as \[\begin{align} & \color{darkred}{\mathit{inv}}_t \le \color{darkred}R + \color{darkblue}{\mathit{invCap}} \cdot (1-\color{darkred}{\mathit{low}}_t ) \\ &\color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 - (\color{darkblue}{\mathit{invCap}}+1)\cdot \color{darkred}{\mathit{low}}_t\end{align} \] This implies $\color{darkred}R \le \color{darkblue}{\mathit{invCap}} $, which is a proper bound.

4. A reorder event takes place when inventory drops below $\color{darkred}R$. This means we should trigger a reorder only when $\color{darkred}{\mathit{low}}_{t-1}=0$ and $\color{darkred}{\mathit{low}}_{t}=1$. In terms of our model, this condition can be written as \[\color{darkred}{\mathit{order}}_t = (1-\color{darkred}{\mathit{low}}_{t-1}) \cdot \color{darkred}{\mathit{low}}_t\] I implemented a linearization of this as: \[\begin{align} & \color{darkred}{\mathit{order}}_t \le 1-\color{darkred}{\mathit{low}}_{t-1} \\ & \color{darkred}{\mathit{order}}_t \le \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \ge \color{darkred}{\mathit{low}}_t -\color{darkred}{\mathit{low}}_{t-1} \end{align}\] Note that this rule implies that after replenishment the inventory must exceed $\color{darkred}R$ (if not, we cannot reorder).

5. Inventory replenishment takes place after an order was placed and when the lead time passed. We can write this as \[\color{darkred}{\mathit{repl}}_t = \color{darkred}Q \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}\] with $\color{darkred}{\mathit{repl}}_t \in [0,\color{darkblue}{\mathit{maxQ}}]$. I linearized this as: \[ \begin{align} & \color{darkred}{\mathit{repl}}_t \le \color{darkblue}{\mathit{maxQ}} \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}} \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkred}Q \\ & \color{darkred}{\mathit{repl}}_t \ge \color{darkred}Q - \color{darkblue}{\mathit{maxQ}}\cdot (1-\color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}) \end{align}\] For simplicity, I will assume that no ordering happened just before $t=1$. In other words, all $\color{darkred}{\mathit{order}}_t=0$ when $t \le 0$. That will make a model a little bit messier, but you will need to address this in a real, practical model.

OK, this was a lot of things to digest. Let's summarize the model:

MIP Model
\[\begin{align} \min \> & \color{darkred} z = \color{darkblue}{\mathit{orderCost}}\cdot\sum_t \color{darkred}{\mathit{order}}_t+\color{darkblue}{\mathit{holdCost}}\cdot \sum_t \color{darkred}{\mathit{inv}}_t+\color{darkblue}{\mathit{backlogCost}} \cdot\sum_t \color{darkred}{\mathit{back}}_t && (1) \\ & \color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t = \color{darkred}{\mathit{inv}}_{t-1}-\color{darkred}{\mathit{back}}_{t-1} - \color{darkblue}{\mathit{demand}}_t + \color{darkred}{\mathit{repl}}_t + \color{darkblue}{\mathit{initInv}}_t && (2) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkblue}{\mathit{invCap}} \cdot \color{darkred}\delta_t \\ & \color{darkred}{\mathit{back}}_t \le \color{darkblue}{\mathit{maxBackLogged}} \cdot (1-\color{darkred}\delta_t) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkred}R + \color{darkblue}{\mathit{invCap}} \cdot (1-\color{darkred}{\mathit{low}}_t ) && (3) \\ &\color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 - (\color{darkblue}{\mathit{invCap}}+1)\cdot \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \le 1-\color{darkred}{\mathit{low}}_{t-1} && (4) \\ &\color{darkred}{\mathit{order}}_t \le \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \ge \color{darkred}{\mathit{low}}_t -\color{darkred}{\mathit{low}}_{t-1} \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkblue}{\mathit{maxQ}} \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}} && (5) \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkred}Q \\ & \color{darkred}{\mathit{repl}}_t \ge \color{darkred}Q - \color{darkblue}{\mathit{maxQ}}\cdot (1-\color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}) \\ & \color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}] \\ & \color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}] \\ & \color{darkred}\delta_t \in \{0,1\} \\ & \color{darkred}{\mathit{low}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{order}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{repl}}_t \in [0,\color{darkblue}{\mathit{maxQ}}] \end{align}\]

MIP Model

\[\begin{align} \min \> & \color{darkred} z = \color{darkblue}{\mathit{orderCost}}\cdot\sum_t \color{darkred}{\mathit{order}}_t+\color{darkblue}{\mathit{holdCost}}\cdot \sum_t \color{darkred}{\mathit{inv}}_t+\color{darkblue}{\mathit{backlogCost}} \cdot\sum_t \color{darkred}{\mathit{back}}_t && (1) \\ & \color{darkred}{\mathit{inv}}_t-\color{darkred}{\mathit{back}}_t = \color{darkred}{\mathit{inv}}_{t-1}-\color{darkred}{\mathit{back}}_{t-1} - \color{darkblue}{\mathit{demand}}_t + \color{darkred}{\mathit{repl}}_t + \color{darkblue}{\mathit{initInv}}_t && (2) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkblue}{\mathit{invCap}} \cdot \color{darkred}\delta_t \\ & \color{darkred}{\mathit{back}}_t \le \color{darkblue}{\mathit{maxBackLogged}} \cdot (1-\color{darkred}\delta_t) \\ & \color{darkred}{\mathit{inv}}_t \le \color{darkred}R + \color{darkblue}{\mathit{invCap}} \cdot (1-\color{darkred}{\mathit{low}}_t ) && (3) \\ &\color{darkred}{\mathit{inv}}_t \ge \color{darkred}R + 1 - (\color{darkblue}{\mathit{invCap}}+1)\cdot \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \le 1-\color{darkred}{\mathit{low}}_{t-1} && (4) \\ &\color{darkred}{\mathit{order}}_t \le \color{darkred}{\mathit{low}}_t \\ & \color{darkred}{\mathit{order}}_t \ge \color{darkred}{\mathit{low}}_t -\color{darkred}{\mathit{low}}_{t-1} \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkblue}{\mathit{maxQ}} \cdot \color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}} && (5) \\ & \color{darkred}{\mathit{repl}}_t \le \color{darkred}Q \\ & \color{darkred}{\mathit{repl}}_t \ge \color{darkred}Q - \color{darkblue}{\mathit{maxQ}}\cdot (1-\color{darkred}{\mathit{order}}_{t-\color{darkblue}{\mathit{leadTime}}}) \\ & \color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}] \\ & \color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] \\ & \color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}] \\ & \color{darkred}\delta_t \in \{0,1\} \\ & \color{darkred}{\mathit{low}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{order}}_t \in \{0,1\} \\ & \color{darkred}{\mathit{repl}}_t \in [0,\color{darkblue}{\mathit{maxQ}}] \end{align}\]

In my experiment, all decision variables were binary or integer variables. This is optional: we can use continuous variables for $\color{darkred}Q$ and $\color{darkred}R$ and related variables, if this is more appropriate.

Results

Detailed results can look like:

----    163 PARAMETER result  

                 demand         inv     backlog       below     reorder   replenish

t1                   17         483
t2                   84         399
t3                   55         344                       1           1
t4                   30         314                       1
t5                   29         285                       1
t6                   22         263                       1
t7                   34         229                       1
t8                   85         144                       1
t9                    6         138                       1
t10                  50          88                       1
t11                  99                      11           1
t12                  57                      68           1
t13                  99         345                                             512
t14                  76         269                       1           1
t15                  13         256                       1
t16                  63         193                       1
t17                  15         178                       1
t18                  25         153                       1
t19                  66          87                       1
t20                  43          44                       1
t21                  35           9                       1
t22                  35                      26           1
t23                  13                      39           1
t24                  15         458                                             512
t25                  58         400
t26                  83         317                       1           1
t27                  23         294                       1
t28                  66         228                       1
t29                  77         151                       1
t30                  30         121                       1
t31                  11         110                       1
t32                  50          60                       1
t33                  16          44                       1
t34                  87                      43           1
t35                  26                      69           1
t36                  28         415                                             512
t37                  59         356
t38                  72         284                       1           1
t39                  62         222                       1
t40                  46         176                       1
t41                  41         135                       1
t42                  11         124                       1
t43                  31          93                       1
t44                   4          89                       1
t45                  33          56                       1
t46                  18          38                       1
t47                  64                      26           1
t48                  56         430                                             512
t49                  76         354
t50                  29         325                       1           1
t51                  66         259                       1
t52                  75         184                       1
t53                  62         122                       1
t54                  28          94                       1
t55                   8          86                       1
t56                  10          76                       1
t57                  64          12                       1
t58                  54                      42           1
t59                   3                      45           1
t60                  79         388                                             512
t61                   7         381
t62                  17         364
t63                  52         312                       1           1
t64                  75         237                       1
t65                  17         220                       1
t66                   3         217                       1
t67                  58         159                       1
t68                  62          97                       1
t69                  38          59                       1
t70                  35          24                       1
t71                  24                                   1
t72                  24                      24           1
t73                  13         475                                             512
t74                  93         382
t75                  37         345
t76                  78         267                       1           1
t77                  30         237                       1
t78                  12         225                       1
t79                  74         151                       1
t80                   6         145                       1
t81                  20         125                       1
t82                             125                       1
t83                  26          99                       1
t84                  49          50                       1
t85                  15          35                       1
t86                  17         530                                             512
t87                  33         497
t88                  31         466
t89                  32         434
t90                  96         338                       1           1
t91                  99         239                       1
t92                  36         203                       1
t93                  37         166                       1
t94                  77          89                       1
t95                  39          50                       1
t96                  91                      41           1
t97                  11                      52           1
t98                  73                     125           1
t99                   5                     130           1
t100                 57         325                       1                     512
Q                   512
R                   344
numorder              8
holdcost          37580
ordercost          4000
backlogcost        7410

A picture of this:

This looks reasonable to me.

Note that in our model the reorder point $\color{darkred}R$ (straight light-blue line) can not be higher than the inventory after replenishment (see point 4 above). The first replenishment is exactly demonstrating this bound. Of course, we can refine the model to relax this requirement.

Conclusion

With some effort, we can form a linear MIP model that determines the optimal values for $(\color{darkred}Q, \color{darkred}R)$ for any (simulated) demand pattern. We needed a few modeling tricks to formulate this problem as a proper linear MIP model, however. This requires a number of big-M constraints. Using indicator constraints and/or quadratic terms (available in some advanced solvers) can make the model simpler. Our model does not make assumptions about special facilities available in the MIP solver, so this model should work with any (good) MIP solver.

References

Minimize an Integer Programming problem with nested decision variables, https://stackoverflow.com/questions/64652771/minimize-an-integer-programming-problem-with-nested-decision-variables

Decision variables
\(\color{darkred}Q \in [0,\color{darkblue}{\mathit{maxQ}}]\)	Reorder quantity. Each time we place an order, this is the size of the order.
\(\color{darkred}R \in [0,\color{darkblue}{\mathit{invCap}}]\)	Reorder point. We place an order when the inventory drops below \(\color{darkred}R\).
\(\color{darkred}{\mathit{inv}}_t \in [0,\color{darkblue}{\mathit{invCap}}] \)	Inventory at end of period \(t\).
\(\color{darkred}{\mathit{back}}_t \in [0,\color{darkblue}{\mathit{maxBackLogged}}]\)	Backlog, unmet demand we fulfill once new inventory arrives, again at end of period \(t\).
\(\color{darkred}{\mathit{low}}_t \in \{0,1\}\)	Inventory is low, defined by \(\color{darkred}{\mathit{inv}}_t \le \color{darkred}R\).
\(\color{darkred}{\mathit{order}}_t \in \{0,1\}\)	An order is placed. We place an order when \(\color{darkred}{\mathit{inv}}_t\) drops below \(\color{darkred}R\).
\(\color{darkred}{\mathit{repl}}_t \in \{0,\color{darkred}Q\}\)	Replenishment. Inventory is replenished by a previously placed order (there is a lead time). This replenishment is also used to handle backlogged demand. \(\color{darkred}{\mathit{repl}}_t\) is either zero or equal to the reorder quantity \(\color{darkred}Q\).