# Yet Another Math Programming Consultant

I am a full-time consultant and provide services related to the design, implementation and deployment of mathematical programming, optimization and data-science applications. I also teach courses and workshops. Usually I cannot blog about projects I am doing, but there are many technical notes I'd like to share. Not least because it gives me an easy way to search for and find them again myself. You can reach me at erwin@amsterdamoptimization.com.

## Saturday, June 12, 2021

### Median, quantiles and quantile regression as linear programming problems

## Thursday, June 10, 2021

### Automatic differentiation and Google

(The URL seems to have a confusing name: it is about automatic differentiation).

## Tuesday, June 8, 2021

### Portfolio optimization with risk as constraint?

The standard mean-variance portfolio optimization models have the form:

Model M1 |
---|

\[\begin{align}\min\>&\color{darkred}{\mathit{Risk}}=\color{darkred}x^T\color{darkblue}Q\color{darkred}x\\ & \color{darkblue}r^T\color{darkred}x \ge \color{darkblue}{\mathit{MinimumReturn}} \\ &\sum_i \color{darkred}x_i = 1\\ & \color{darkred}x \ge 0\end{align}\] |

or maybe:

Model M2 |
---|

\[\begin{align}\min\>&\color{darkred}z=\color{darkred}x^T\color{darkblue}Q\color{darkred}x - \color{darkblue}\lambda \cdot\color{darkblue}r^T\color{darkred}x\\ &\sum_i \color{darkred}x_i = 1\\ & \color{darkred}x \ge 0\end{align}\] |

where

- \(\color{darkblue}Q\) is a variance-covariance matrix (we assume here it is positive semi-definite [1]),
- \(\color{darkblue}\lambda\) is an exogenous constant that sometimes is varied to draw an efficient frontier, and
- \(\color{darkblue}r\) are the returns.
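
Model M1 can be sketched with a general-purpose NLP solver. Below is a minimal example using `scipy.optimize.minimize`; the data (`Q`, `r`, and the minimum return) are made up for illustration and are not from any real portfolio:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data: 3 assets (Q positive semi-definite, r are returns)
Q = np.array([[0.10, 0.02, 0.01],
              [0.02, 0.08, 0.03],
              [0.01, 0.03, 0.12]])
r = np.array([0.07, 0.05, 0.06])
min_return = 0.06

# Model M1: min x'Qx  s.t.  r'x >= MinimumReturn, sum(x) = 1, x >= 0
res = minimize(
    lambda x: x @ Q @ x,                      # Risk = x'Qx
    x0=np.full(3, 1 / 3),
    constraints=[
        {"type": "ineq", "fun": lambda x: r @ x - min_return},
        {"type": "eq",   "fun": lambda x: x.sum() - 1},
    ],
    bounds=[(0, None)] * 3,
)
weights = res.x
```

For a convex quadratic model like this, a dedicated QP solver (or a modeling tool such as CVXPY) would be the more natural choice; the sketch above only shows the structure of M1.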

## Wednesday, June 2, 2021

### Total Least Squares: nonconvex optimization

**Total Least Squares** (TLS) is an alternative to OLS (**Ordinary Least Squares**). It is a form of **orthogonal regression** and also deals with the problem of EIV (**Errors-in-Variables**).

The standard OLS model is \[\color{darkblue}y = \color{darkblue}X\color{darkred}\beta + \color{darkred}\varepsilon\] where we minimize the sum-of-squares of the residuals \[\min ||\color{darkred}\varepsilon||_2^2\] We can interpret \(\color{darkred}\varepsilon\) as the error in \(\color{darkblue}y\).

In TLS, we also allow for errors in \(\color{darkblue}X\). The model becomes \[\color{darkblue}y+\color{darkred}\varepsilon=(\color{darkblue}X+\color{darkred}E)\color{darkred}\beta\] Note that we made a sign change in \(\color{darkred}\varepsilon\). This is purely aesthetic: it makes the equation look more symmetric. The objective is specified as \[\min \> ||\left(\color{darkred}\varepsilon \> \color{darkred}E\right)||_F\] i.e. the Frobenius norm of the matrix formed by \(\color{darkred}\varepsilon\) and \(\color{darkred}E\). The Frobenius norm is just \[||A||_F=\sqrt{\sum_{i,j}a_{i,j}^2}\] We can drop the square root from the objective (the solution will remain the same, but we get rid of a nonlinear function with a possible problem near zero: the gradient is not defined there). The remaining problem is a non-convex quadratic problem, which can be solved with global MINLP solvers such as Baron or with a global quadratic solver like Gurobi.
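
For this linear TLS model there is also a well-known closed-form solution via the SVD of the augmented matrix \([X\>y]\): the right singular vector belonging to the smallest singular value spans the solution. A small numpy sketch on synthetic data (the data-generating setup is invented for illustration):

```python
import numpy as np

# Synthetic data: y = X @ beta_true + small noise
rng = np.random.default_rng(42)
n = 100
X = rng.normal(size=(n, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + 0.01 * rng.normal(size=n)

# TLS via SVD of the augmented matrix [X y]: the right singular vector
# for the smallest singular value is proportional to (beta, -1)
A = np.column_stack([X, y])
_, _, Vt = np.linalg.svd(A)
v = Vt[-1]                      # smallest singular direction
beta_tls = -v[:-1] / v[-1]      # recover beta from (beta, -1) direction
```

The general optimization formulation in the post remains useful when extra structure is added (e.g. known error-free columns, constraints on \(\beta\)), where the plain SVD recipe no longer applies.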

## Sunday, May 23, 2021

### Solving linear complementarity problems without an LCP solver

#### Introduction

LCP Problem |
---|

\[\begin{align} &\text{Find $\color{darkred}x\ge 0,\color{darkred}w\ge 0$ such that:}\\ & \color{darkred}x^T\color{darkred}w = 0 \\ &\color{darkred}w =\color{darkblue}M \color{darkred}x + \color{darkblue}q\end{align}\] |
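One way to avoid a dedicated LCP solver is to minimize the complementarity product itself: \(\min \> \color{darkred}x^T\color{darkred}w\) subject to \(\color{darkred}w=\color{darkblue}M\color{darkred}x+\color{darkblue}q\), \(\color{darkred}x,\color{darkred}w\ge 0\). Since the objective is nonnegative on the feasible set, an optimal value of zero certifies an LCP solution. A minimal sketch with made-up data (for this particular \(M\) the QP happens to be convex; in general it is not):

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative LCP data: M positive definite guarantees a solution exists
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
q = np.array([-4.0, -5.0])

# QP reformulation: min x'(Mx+q)  s.t.  Mx+q >= 0, x >= 0
# (w is substituted out as w = Mx + q)
res = minimize(
    lambda x: x @ (M @ x + q),
    x0=np.full(2, 2.0),                               # a feasible start
    constraints=[{"type": "ineq", "fun": lambda x: M @ x + q}],
    bounds=[(0, None)] * 2,
)
x = res.x
w = M @ x + q
```

Note that a local solver only proves we found an LCP solution if the final objective is (numerically) zero; for a nonconvex \(M\) a global solver, or a MIP reformulation of the complementarity condition, would be needed.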

## Monday, May 10, 2021

### Clustering models

In Statistics (nowadays called Data Science or A.I. for public relations reasons), clustering is one of the most popular techniques available. Of course, nothing beats linear regression in the popularity contest. Here, I'd like to discuss two clustering models: **\(k\)-means** and **\(k\)-medoids**. For these models, there exist well-defined equivalent **Mixed-Integer Programming** models. In practice, they don't work very well except for small data sets. I think they are still useful to discuss for different reasons:

- The formulations are interesting. They have some angles that may not be obvious at first.
- They define the problem in a succinct way. Verbal descriptions are always fraught with imprecision and vagueness. A reference model can help to make things explicit. I.e. use a model as a documentation tool.
- Not all data sets are large. For small data sets, we can prove optimality, where the usual heuristics can only deliver good solutions without much information about their quality. Obviously, clustering is often used in situations where ultimate optimality may not matter much, as it is frequently used as an exploratory tool.
- We can adapt the model to special cases. Adding constraints such as a minimum and maximum number of points per cluster comes to mind [3].
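
As an illustration of such an equivalent MIP, here is a minimal \(k\)-medoids formulation solved with `scipy.optimize.milp` on a tiny made-up data set: binary variables \(x_{i,j}\) (point \(i\) assigned to medoid \(j\)) and \(y_j\) (point \(j\) is a medoid):

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Tiny illustrative data set: two obvious clusters, k = 2 medoids
pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.2], [5.2, 4.9]])
n, k = len(pts), 2
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)

# Decision vector: first n*n entries are x[i,j], last n entries are y[j]
nx = n * n
c = np.concatenate([d.ravel(), np.zeros(n)])   # min sum d[i,j]*x[i,j]

cons = []
# Each point is assigned to exactly one medoid: sum_j x[i,j] = 1
A1 = np.zeros((n, nx + n))
for i in range(n):
    A1[i, i * n:(i + 1) * n] = 1
cons.append(LinearConstraint(A1, 1, 1))
# Linking: x[i,j] <= y[j] (only assign to selected medoids)
A2 = np.zeros((nx, nx + n))
for i in range(n):
    for j in range(n):
        A2[i * n + j, i * n + j] = 1
        A2[i * n + j, nx + j] = -1
cons.append(LinearConstraint(A2, -np.inf, 0))
# Exactly k medoids: sum_j y[j] = k
A3 = np.zeros((1, nx + n))
A3[0, nx:] = 1
cons.append(LinearConstraint(A3, k, k))

res = milp(c=c, constraints=cons, integrality=np.ones(nx + n),
           bounds=Bounds(0, 1))
y = res.x[nx:].round().astype(int)
medoids = np.flatnonzero(y)
```

This dense-matrix setup is only meant to make the formulation explicit; for anything beyond toy sizes one would use sparse constraint matrices and, as the post notes, the MIP approach itself only pays off on small data sets.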

## Thursday, April 29, 2021

### What is this BRATIO option in GAMS?

This is a note about BRATIO, a very exotic GAMS option related to providing an advanced basis to a Simplex-based LP (or NLP) solver.

## Tuesday, April 20, 2021

### Inverting a matrix by LP

## Monday, April 19, 2021

### Parallel machine scheduling II, three more formulations

In [1] two formulations were discussed for a **scheduling problem with multiple machines**. Here we add a few more. Some of them are a bit strange. None of them really works better than the ones in [1].

So this is a collection of formulations that may sound reasonable at first, but are really not suited for this problem. If you want to read about good models, skip this post.

## Monday, April 5, 2021

### This is really a modeling issue

In [1], the following scenario is described:

- A MIP model is solved, delivering a **feasible** solution.
- All integer variables are fixed to their levels.
- The resulting LP model is solved in order to produce duals. This model turns out to be **infeasible**.
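
The mechanics of this fix-and-resolve scheme can be sketched with scipy (a toy model, invented for illustration, where the fixed LP happens to stay feasible; the post's scenario is that it may not, e.g. because the MIP solution is only integer-feasible within tolerances):

```python
import numpy as np
from scipy.optimize import milp, linprog, LinearConstraint, Bounds

# Toy MIP: max 3x + 2y  s.t.  x + y <= 4.5, x integer, x,y >= 0
c = np.array([-3.0, -2.0])                      # milp minimizes, so negate
A = np.array([[1.0, 1.0]])
res_mip = milp(c=c,
               constraints=LinearConstraint(A, -np.inf, 4.5),
               integrality=np.array([1, 0]),    # x integer, y continuous
               bounds=Bounds(0, np.inf))

# Fix the integer variable to its level and re-solve as an LP for duals
x_int = round(res_mip.x[0])
res_lp = linprog(c, A_ub=A, b_ub=[4.5],
                 bounds=[(x_int, x_int), (0, None)], method="highs")
duals = res_lp.ineqlin.marginals                # dual of x + y <= 4.5
```

In the post's scenario, the second solve would come back infeasible instead of returning duals; the fix belongs in the model (or in how the levels are fixed), not in the solver.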