Monday, April 17, 2017

MIP model in R using the OMPR package

A small model with data in R is a good opportunity to try out the OMPR modeling tool (2). Also a first time for me to see the solver Symphony (3) in action.  

Problem Description

From (1) (see references below):

I have 4 stores (1,2,3,4) and I can apply 3 treatments (A,B,C) to each of the 4 stores. Each treatment has its own cost and profit.

The matrix is as follows:

Store   Treatment   Cost    Profit
1   A   50  100
1   B   100 200
1   C   75  50
2   A   25  25
2   B   150 0
2   C   50  25
3   A   100 300
3   B   125 250
3   C   75  275
4   A   25  25
4   B   50  75
4   C   75  125

How can I maximize the profit having a constraint on maximum cost in R? In reality, I have a large number of stores and hence cannot be done manually. Each store can get only 1 treatment.

Thanks in advance.

Optimization Model

The optimization model for this problem can look like:

image

Loading the data

Loading the data into a data frame is easy with read.table:

df<-read.table(text="
Store   Treatment   Cost    Profit
1   A   50  100
1   B   100 200
1   C   75  50
2   A   25  25
2   B   150 0
2   C   50  25
3   A   100 300
3   B   125 250
3   C   75  275
4   A   25  25
4   B   50  75
4   C   75  125
",header=T)
df
##    Store Treatment Cost Profit
## 1      1         A   50    100
## 2      1         B  100    200
## 3      1         C   75     50
## 4      2         A   25     25
## 5      2         B  150      0
## 6      2         C   50     25
## 7      3         A  100    300
## 8      3         B  125    250
## 9      3         C   75    275
## 10     4         A   25     25
## 11     4         B   50     75
## 12     4         C   75    125

OMPR Model

I am going to try to model this problem with OMPR (2) and solve it with the Symphony MIP solver (3).

First we need a bunch of packages:

library(dplyr)
library(tidyr)
library(ROI)
library(ROI.plugin.symphony)
library(ompr)
library(ompr.roi)

Extract data

OMPR is using numbers as indices so it makes sense to extract the cost and profit data as matrices. We do this as follows:

# extract some basic data
stores<-unique(df$Store)
stores
## [1] 1 2 3 4
treatments <- levels(df$Treatment)
treatments
## [1] "A" "B" "C"
num_treatments <- length(treatments)
# cost matrix
cost <- as.matrix(spread(subset(df,select=c(Store,Treatment,Cost)),Treatment,Cost)[,-1])
cost
##     A   B  C
## 1  50 100 75
## 2  25 150 50
## 3 100 125 75
## 4  25  50 75
# profit matrix
profit <- as.matrix(spread(subset(df,select=c(Store,Treatment,Profit)),Treatment,Profit)[,-1])
profit
##     A   B   C
## 1 100 200  50
## 2  25   0  25
## 3 300 250 275
## 4  25  75 125
# maximum cost allowed
max_cost <- 300

MIP Model

The OMPR package allows us to specify linear equations algebraically. Most LP/MIP solvers in R use a matrix notation, which often is more difficult to use. Here we go:

m <- MIPModel() %>%
  add_variable(x[i,j], i=stores, j=1:num_treatments, type="binary") %>%
  add_constraint(sum_expr(x[i,j], j=1:num_treatments)<=1, i=stores) %>%
  add_constraint(sum_expr(cost[i,j]*x[i,j], i=stores, j=1:num_treatments) <= max_cost) %>%
  set_objective(sum_expr(profit[i,j]*x[i,j], i=stores, j=1:num_treatments),"max") %>%
  solve_model(with_ROI(solver = "symphony",verbosity=1))
## 
## Starting Preprocessing...
## Preprocessing finished...
##       with no modifications...
## Problem has 
##   5 constraints 
##   12 variables 
##   24 nonzero coefficients
## 
## Total Presolve Time: 0.000000...
## 
## Solving...
## 
## granularity set at 25.000000
## solving root lp relaxation
## The LP value is: -650.000 [0,5]
## 
## 
## ****** Found Better Feasible Solution !
## ****** Cost: -650.000000
## 
## 
## ****************************************************
## * Optimal Solution Found                           *
## * Now displaying stats and best solution found...  *
## ****************************************************
## 
## ====================== Misc Timing =========================
##   Problem IO        0.000
## ======================= CP Timing ===========================
##   Cut Pool                  0.000
## ====================== LP/CG Timing =========================
##   LP Solution Time          0.000
##   LP Setup Time             0.000
##   Variable Fixing           0.000
##   Pricing                   0.000
##   Strong Branching          0.000
##   Separation                0.000
##   Primal Heuristics         0.000
##   Communication             0.000
##   Total User Time              0.000
##   Total Wallclock Time         0.000
## 
## ====================== Statistics =========================
## Number of created nodes :       1
## Number of analyzed nodes:       1
## Depth of tree:                  0
## Size of the tree:               1
## Number of solutions found:      1
## Number of solutions in pool:    1
## Number of Chains:               1
## Number of Diving Halts:         0
## Number of cuts in cut pool:     0
## Lower Bound in Root:            -650.000
## 
## ======================= LP Solver =========================
## Number of times LP solver called:                 1
## Number of calls from feasibility pump:            0
## Number of calls from strong branching:            0
## Number of solutions found by LP solve:            1
## Number of bounds changed by strong branching:     0
## Number of nodes pruned by strong branching:       0
## Number of bounds changed by branching presolver:  0
## Number of nodes pruned by branching presolver:    0
## 
## ==================== Primal Heuristics ====================
##                              Time      #Called   #Solutions
## Rounding I                   0.00                           
## Rounding II                  0.00                           
## Diving                       0.00 
## Feasibility Pump             0.00 
## Local Search                 0.00            1            0 
## Restricted Search            0.00 
## Rins Search                  0.00 
## Local Branching              0.00 
## 
## =========================== Cuts ==========================
## Accepted:                         0
## Added to LPs:                     0
## Deleted from LPs:                 0
## Removed because of bad coeffs:    0
## Removed because of duplicacy:     0
## Insufficiently violated:          0
## In root:                          0
## 
## Time in cut generation:              0.00
## Time in checking quality and adding: 0.00
## 
##                    Time     #Called     In Root       Total
## Gomory             0.00 
## Knapsack           0.00 
## Clique             0.00 
## Probing            0.00 
## Flowcover          0.00 
## Twomir             0.00 
## Oddhole            0.00 
## Mir                0.00 
## Rounding           0.00 
## LandP-I            0.00 
## LandP-II           0.00 
## Redsplit           0.00 
## 
## ===========================================================
## Solution Found: Node 0, Level 0
## Solution Cost: -650.0000000000
## +++++++++++++++++++++++++++++++++++++++++++++++++++
## User indices and values of nonzeros in the solution
## +++++++++++++++++++++++++++++++++++++++++++++++++++
##       1 1.0000000000
##       2 1.0000000000
##       4 1.0000000000
##      11 1.0000000000    
## 

Solution

A “raw" solution can be generated using:

get_solution(m,x[i, j]) %>%
  filter(value > 0)
##   variable i j value
## 1        x 2 1     1
## 2        x 3 1     1
## 3        x 1 2     1
## 4        x 4 3     1

We want to clean this up a bit:

cat("Status:",solver_status(m),"\n")
cat("Objective:",objective_value(m),"\n")
get_solution(m,x[i, j]) %>%
  filter(value > 0) %>%
  mutate(Treatment = treatments[j],Store = i) %>%
  select(Store,Treatment)
## Status: optimal 
## Objective: 650 
##   Store Treatment
## 1     2         A
## 2     3         A
## 3     1         B
## 4     4         C
References
  1. The original problem is from: Mixed linear integer optimization - Optimize Profit based on different treatments to all the stores in R, http://stackoverflow.com/questions/43446194/optimize-profit-based-on-different-treatments-to-all-the-stores-in-r
  2. Dirk Schumacher, OMPR: R package to model Mixed Integer Linear Programs, https://github.com/dirkschumacher/ompr
  3. Symphony MIP Solver: https://projects.coin-or.org/SYMPHONY
  4. Paul Rubin, MIP Models in R with OMPR, http://orinanobworld.blogspot.com/2016/11/mip-models-in-r-with-ompr.html

Thursday, April 13, 2017

QP as LP: cutting planes

In (1) I described a simple linearization scheme for a QP model we we can solve it as a straight LP. For a simple (convex) quadratic function \(f(x)\), instead of solving:

\[ \min\> f(x)=x^2\]

we solve

\[\begin{align} \min\>&z\\
                              &z \ge a_i x + b_i\end{align}\]

In this post I do things slightly different: instead of adding the linear inequalities ahead of time, we add them one at the time based on the previously found optimal point. This approach is called a cutting plane technique (2).

Example: portfolio model

We consider again the simple portfolio model:

\[\boxed{\begin{align} \min \>& \sum_t w_t^2\\
                              & w_t = \sum_i r’_{i,t} x_i\\
                              & \sum_i \mu_i x_i \ge R\\
                              & \sum_i x_i = 1\\
                              &x_i\ge 0\end{align}}
\]

The model is initially linearized as an LP model:

\[\boxed{\begin{align}\min \>& \sum_t z_t\\
                              & w_t = \sum_i r’_{i,t} x_i\\
                              & \sum_i \mu_i x_i \ge R\\
                              & \sum_i x_i = 1\\
                              &x_i, z_t\ge 0\end{align}}
\]

Later on, during the CP algorithm, we will add linear cuts of the form

\[z_t \ge a_t w_t + b_t\]

The algorithm would look like:

  1. \(k:=1\)
  2. Solve model LP, let \(w^*_t\) be the optimal values for \(w_t\).
  3. if \(k=\)MAXIT: STOP
  4. Add the constraint \(z_t \ge a_t w_t + b_t\) where \(a_t=2 w^*_t\) and \(b_t=-(w^*_t)^2\). Note that we add one cut for each \(w_t\) here (our dataset has \(T=717\) time periods). 
  5. \(k:=k+1\)
  6. go to step 2

Here we stop simply when a certain number of iterations MAXIT has been exceeded. That can be refined by stopping when the objective does not seem to change much anymore. Another optimization could be to only add cuts that are different from the ones added before (for some \(t\) we may converge quicker than for others).

The algorithm converges very fast:

----    118 PARAMETER report2 

           obj(LP)         w'w

iter1               1.02907546
iter2   0.21577087  0.46879016
iter3   0.33210537  0.48432099
iter4   0.37143835  0.41397603
iter5   0.40990988  0.41362293
iter6   0.41109694  0.41222051
iter7   0.41183602  0.41212331
iter8   0.41192720  0.41204267
iter9   0.41196991  0.41204112
iter10  0.41197620  0.41203782

Note that the optimal QP solution has an objective: 0.41202816.

This is pretty good performance especially because the dataset is not very small (it has 717 time periods and 83 stocks).

Here is a picture of the cuts introduced for the first element \(z_1=w_1^2\):

animation

A combined approach

We can even combine the two methods:

  1. Start with a coarse-grained (i.e. cheap) initial set of cuts. In (1) we used \(n=10\) inequalities per \(w_t\). For this experiment I reduced this to \(n=5\).
  2. Then apply our cutting plane algorithm. Instead of MAXIT=10 iterations we now do 5 iterations.

This yields even faster convergence:

----    142 PARAMETER report3 

          obj(LP)         w'w

iter1  0.32921509  0.41401219
iter2  0.40812966  0.41471624
iter3  0.41017228  0.41211351
iter4  0.41188869  0.41208158
iter5  0.41195624  0.41203588

References
  1. QP as LP: piecewise linear functions, http://yetanothermathprogrammingconsultant.blogspot.com/2017/04/qp-as-lp-piecewise-linear-functions.html
  2. J. E. Kelley, Jr. "The Cutting-Plane Method for Solving Convex Programs." J. Soc. Indust. Appl. Math. Vol. 8, No. 4, pp. 703-712, December, 1960.
  3. Cardinality Constrained Portfolio Optimization: MIQP without MIQP Solver, http://yetanothermathprogrammingconsultant.blogspot.com/2016/02/cardinality-constrained-portfolio.html. Here this cutting plane method is applied to a MIQP model (not strictly “allowed” as this is no longer convex, but useful as a heuristic).

QP as LP: piecewise linear functions

I am facing a (convex) QP problem that has some numerical issues for some data sets. Hence as backup I’d like to be able to solve the problem as an LP. One way to do this is using a simple piecewise linear approximation.

Consider the objective function \(\min\> f(x)=x^2\). Instead of solving this directly we can invent a few linear functions \(y = a_i x + b_i\) and approximate the problem by:

\[\begin{align} \min\>&z\\
                             &z \ge a_i x + b_i\end{align}\]

Here is an example how these linear functions can look like:

image

To derive a single linear function we observe the following:

  1. Pick a point \(t_i\), e.g. \(t_i=2\) in the picture below.
  2. We want the function value to match: \(f(t_i) = a_i t_i + b_i\). I.e. \(a_i t_i + b_i=t_i^2\).
  3. We also want the slope to match: \(\nabla f(t_i) = a_i\). I.e. \(a_i = 2 t_i\).

image

This means we can choose:

\[\begin{align} &a_i = 2 t_i\\
                     &b_i = –t_i^2\end{align}\]

The net effect of this strategy is that we formed a piecewise linear objective that approximates our original quadratic function:

image

Note that there are other linear approximation schemes. This is just a particularly simple one.

Example: portfolio optimization

To try this out on a more realistic problem, we consider a standard portfolio optimization problem:

\[\boxed{\begin{align} \min \>& \sum_t w_t^2\\
                              & w_t = \sum_i r’_{i,t} x_i\\
                              & \sum_i \mu_i x_i \ge R\\
                              & \sum_i x_i = 1\\
                              &x_i\ge 0\end{align}}
\]

Here \(r’=r-\mu\) are the mean adjusted returns.\(R\) is the minimum expected portfolio return.

In this case our linear model can look like:

\[\boxed{\begin{align} \min \>& \sum_t z_t\\
                              & z_t \ge a_{t,j} w_t + b_{t,j}\>\forall t,j\\
                              & w_t = \sum_i r’_{i,t} x_i\\
                              & \sum_i \mu_i x_i \ge R\\
                              & \sum_i x_i = 1\\
                              &x_i\ge 0\end{align}}
\]

The \(w_t\)’s vary from \(L_t=\displaystyle\min_i r’_{i,t}\) to \(U_t=\displaystyle\max_i r’_{i,t}\). For each \(w_t\) we choose \(n=10\) equidistant points between \([L_t,U_t]\). We use the FTSE100 dataset from (1), which has:

  • number of time periods \(t\): 717
  • number of stocks \(i\): 83

We see the following results:

  • The optimal QP objective is 0.41202816
  • The optimal LP objective is 0.37728185

This looks bad, but actually things are better than they appear. When we evaluate \(\displaystyle\sum_i w_t^2\) for the optimal LP values of \(w_t\) we have:

  • QP objective of LP solution: 0.41246149

Here are the differences in the selected portfolio:

----     87 PARAMETER report 

QP objective 0.41202816,    LP objective 0.37728185,    w'w from LP  0.41246149


----     87 PARAMETER xval 

                 QP          LP

stock2   0.11723384  0.10298780
stock11  0.13629495  0.14992714
stock22  0.06253534  0.06202882
stock40  0.09082870  0.10523583
stock64  0.08521849  0.08348895
stock66  0.17247021  0.17267818
stock69  0.00180266
stock78  0.09462530  0.09761814
stock79  0.05184550  0.04383311
stock83  0.18714501  0.18220203

Performance

Although the LP model is much larger, it solves in about the same time as the original QP. I see:

image

Note that barrier iterations should really not be compared to simplex iterations. A finer or more coarse grid in the piecewise linear approximation has only affect on the the number of equations in the model (the number of variables will stay the same).

References
  1. Renato Bruni, Francesco Cesarone, Andrea Scozzari, Fabio Tardella, Real-world datasets for portfolio selection and solutions of some stochastic dominance portfolio models, Data in Brief, Volume 8, September 2016, Pages 858-862
  2. QP as LP: cutting planes, http://yetanothermathprogrammingconsultant.blogspot.com/2017/04/qp-as-lp-cutting-planes.html

Wednesday, April 5, 2017

MIP Modeling: preventing a bit-pattern

From (1):

Let us say we have \(n\) binary variables \(x_i\) for all \(i=1,2,…,n\) i.e., \(x_i \in \{0,1\}\) for all \(i=1,2,…,n\).

I need to write the following constraint:

  • If \(x_i=1\) and \(x_{i+2}=1\), then \(x_{i+1}=1\). In other words, the variables \(x_i\) must be consecutives.

Basically we want to prevent the sequence \([1,0,1]\) but allow all other patterns. This can be accomplished with the constraint:

\[\boxed{x_i-x_{i+1}+x_{i+2} \le 1}\]

This is actually a special case of (2). Of course for this small example we can inspect all possible combinations (there are \(2^3=8\) of them) to verify we do the right thing:

image

Counting Start-ups

What about if you want to have all your \(x_i=1\) consecutive. I.e. we only allow something like \([0,0,1,1,1,1,0,0,0]\) where there are no holes in the series of ones. This can be best modeled by counting the number of times we go from \(0\) to \(1\). A compact formulation is:

\[\boxed{\begin{align}&s_i \ge x_i – x_{i-1}\\
                     &\sum_i s_i \le 1\\
                     &s_i \in \{0,1\}\end{align}}\]

Here we force \(s_i=1\) if we go from \(0\) to \(1\). Note that we can relax \(s_i\) to be continuous (i.e. \(s_i \in [0,1]\)).

This example would show:

\[\begin{array}{cccccccccc}x_i=&[0&0&1&1&1&1&0&0&0]\\
                                        s_i=&[0&0&1&0&0&0&0&0&0]\end{array}\]
References
  1. How to model a consecutive binary constraint? http://math.stackexchange.com/questions/2218389/how-to-model-a-consecutive-binary-constraint
  2. Integer Cuts, http://yetanothermathprogrammingconsultant.blogspot.com/2011/10/integer-cuts.html
  3. Modeling number of machine start-ups, http://yetanothermathprogrammingconsultant.blogspot.com/2014/12/modeling-number-of-machine-start-ups.html

Tuesday, April 4, 2017

Satalia SolveEngine

From the website (1), I am not sure what this is, although the soundbite is intriguing:

image

In (2) there is more information:

The SolveEngine is a cloud-based system, similar to NEOS except with more powerful solvers and much shorter queue times.

Note that NEOS has all the major commercial LP/MIP solvers: Cplex, Gurobi, Xpress and Mosek. I am curious what these more powerful solvers are.

References
  1. Satalia SolverEngine, https://solve.satalia.com
  2. OpenSolver for Google Sheets 2.3.0 (2 April 2017), https://opensolver.org/opensolver-for-google-sheets-2-3-0-2-april-2017/