Saturday, July 31, 2021

A scheduling problem: a modeling approach

 Here is a simple scheduling problem [1]:

  • We have \(T\) time periods (weeks)
  • and \(N\) jobs to be scheduled.
  • Each job has a duration or processing time (again expressed in weeks), and a given resource (personnel) requirement during the execution of the job. This demand is expressed as: \(\color{darkblue}{\mathit{resource}}_{j,t}\)  where \(j\) indicates the job and \(t\) indicates the week since the start of the job.
  • Not in [1], but I added them to make things a bit more realistic: we have some release dates indicating that a job may not start before a given date. A reason can be that raw material is not available before that date.
  • Similarly, I added due dates: a job must finish before a given date.
  • The objective is to minimize the maximum amount of resources we need over the whole planning period. We can think of this as the capacity we need (e.g. number of personnel).
In this post, I want to show a standard approach to modeling this problem and, in contrast, how I would attack it.
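To make things concrete, here is a sketch of what such a standard time-indexed formulation could look like (my notation; the models in the post may differ). The binary variable \(\color{darkred}x_{j,t}\) indicates that job \(j\) starts in week \(t\), and \(\color{darkred}z\) is the capacity we minimize: \[\begin{align}\min\>&\color{darkred}z\\ & \sum_t \color{darkred}x_{j,t} = 1 && \forall j\\ & \color{darkred}z \ge \sum_j \sum_{t'\le t} \color{darkblue}{\mathit{resource}}_{j,t-t'+1}\cdot \color{darkred}x_{j,t'} && \forall t\\ & \color{darkred}x_{j,t} = 0 && \text{for weeks $t$ outside the release/due date window of job $j$}\\ & \color{darkred}x_{j,t} \in \{0,1\}\end{align}\] Here \(\color{darkblue}{\mathit{resource}}_{j,s}\) is taken to be zero for \(s\) beyond the processing time of job \(j\).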

Tuesday, July 27, 2021

A variant of an assignment problem

 In [1], the following problem is sketched:

  • We have different types of workers. In the data set below, I call them type-A and type-B workers. 
  • We need to assign workers to jobs.
  • We have more workers than jobs, so we can execute all jobs. 
  • This is a standard assignment problem: \[\begin{align} \min &\sum_{w,j} \color{darkblue}c_{w,j} \color{darkred}x_{w,j}\\ & \sum_j \color{darkred}x_{w,j} \le 1 &&\forall w \\ & \sum_w \color{darkred}x_{w,j} = 1 && \forall j \\ & \color{darkred}x_{w,j} \in [0,1] \end{align}\]
  • The complication is that we can form teams of two workers, one of type A and one of type B. Such a team can work faster and execute the job in less time.
  • How to handle this?

I don't know how to handle this within an assignment problem, but stating this as a MIP allows us to make some progress. Allowing two persons to work on a job is easy. Just replace the second constraint by \[1 \le \sum_w \color{darkred}x_{w,j} \le 2\] Unfortunately this ignores that we need one worker of each type in a team. Also, this makes it difficult to impose a cost for each specific team.
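One way to make progress (a sketch in my own notation, not necessarily what the post works out): introduce explicit team variables \(\color{darkred}y_{a,b,j}\in\{0,1\}\) for a team of type-A worker \(a\) and type-B worker \(b\) doing job \(j\), with its own cost \(\color{darkblue}d_{a,b,j}\): \[\begin{align} \min\>&\sum_{w,j} \color{darkblue}c_{w,j}\color{darkred}x_{w,j} + \sum_{a,b,j}\color{darkblue}d_{a,b,j}\color{darkred}y_{a,b,j}\\ & \sum_w \color{darkred}x_{w,j} + \sum_{a,b} \color{darkred}y_{a,b,j} = 1 && \forall j\\ & \sum_j \color{darkred}x_{a,j} + \sum_{b,j} \color{darkred}y_{a,b,j} \le 1 && \forall a\\ & \sum_j \color{darkred}x_{b,j} + \sum_{a,j} \color{darkred}y_{a,b,j} \le 1 && \forall b\\ & \color{darkred}x_{w,j}, \color{darkred}y_{a,b,j} \in \{0,1\}\end{align}\] Each job is done by a single worker or by one team, and no worker appears in more than one assignment.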

Tuesday, July 20, 2021

Spatial Equilibrium

Just a quick experiment with a small price-endogenous spatial equilibrium model.

In this problem, we consider commodities (goods) produced and consumed in different regions. We can trade between regions.

This price-endogenous spatial model will find equilibrium prices, supply and demand quantities, and trade patterns. Our model has a connection with the transportation model; the difference is that we formulate it as a system of complementarity constraints (no objective), with the duals explicitly present as variables. In other words, we solve the KKT conditions.
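To give a flavor of these complementarity conditions, the trade part of such a model can be sketched as (my notation, with \(\color{darkred}x_{i,j}\) the shipment from region \(i\) to region \(j\), \(\color{darkred}p_i\) the regional price, and \(\color{darkblue}c_{i,j}\) the transport cost): \[0 \le \color{darkred}x_{i,j} \perp \color{darkred}p_i + \color{darkblue}c_{i,j} - \color{darkred}p_j \ge 0 \quad \forall i,j\] i.e., there is no trade from \(i\) to \(j\) unless the price difference exactly covers the transport cost. Similar paired conditions link supplies, demands, and prices.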

I use a small data set from [3], and see if we can reproduce things.

Monday, July 12, 2021

GAMS Singleton Set Issue

There is a nasty problem with the singleton set in GAMS. A singleton set is a set that can contain at most one element: assigning a new element replaces the current one. I don't use singleton sets very often. Here is a reason why.

Consider the small example:


set i /i1*i20/;
parameter p(i);
p(i) = uniform(0,10);

* find last element >= 5
scalar threshold /5/;
singleton set k(i);
loop(i,
  k(i)$(p(i)>=threshold) = yes;
);

display p,k;


I would expect the output for k to be the last element of p(i) with a value of at least 5 (here: i19). However, we see:


----     12 PARAMETER p  

i1  1.717,    i2  8.433,    i3  5.504,    i4  3.011,    i5  2.922,    i6  2.241,    i7  3.498,    i8  8.563
i9  0.671,    i10 5.002,    i11 9.981,    i12 5.787,    i13 9.911,    i14 7.623,    i15 1.307,    i16 6.397
i17 1.595,    i18 2.501,    i19 6.689,    i20 4.354


----     12 SET k  

                                                      ( EMPTY )
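The culprit is the replacement semantics of singleton sets: an assignment to a singleton set first clears it, apparently even when the $-condition on the left-hand side is false. So the last loop iteration (i20, with p below the threshold) wipes k out again. A workaround that seems to behave as intended is to put the condition on the loop instead of on the assignment (a minimal sketch):


loop(i$(p(i)>=threshold),
  k(i) = yes;
);


Now the assignment is only executed for qualifying elements, and k ends up holding the last one (i19).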


Monday, July 5, 2021

Another sports schedule problem

Wimbledon 2010 Mixed Doubles (source)

 

In [1] a question is asked about designing a sports schedule as follows:

  • There are 16 players, 7 rounds, and 4 games per round.
  • A game consists of 2 teams with 2 players per team.
  • Each player has a rating. A team rating is the sum of the ratings of the two members.
  • A player plays exactly one game in each round.
  • A player can be partnered with another player only once.
  • A player can play against another player only twice. 
  • Design a schedule that has the minimum difference in ratings between opposing teams.

I'll formulate two different models: one with a small number of binary variables, and one with a large number of binary variables. As we shall see, the model with the larger number of variables will win.

Wednesday, June 30, 2021

Arranging points on a line

The problem stated in [1] is:



In the original problem, the poster talked about the distance between neighbors. But we don't know in advance what the neighboring points are. Of course, we can just generalize and talk about any two points. 

This problem looks deceptively simple. The modeling has some interesting angles.

We will attack this problem in different ways:
  • An MINLP problem. The main idea is to get rid of the abs() function as this is non-differentiable.
  • A MIP model using binary variables or SOS1 variables. Again, we reformulate the abs() function (see the sketch after this list).
  • Apply a fixing strategy to reduce the model complexity. Good solvers actually don't need this.
  • A genetic algorithm (ga) approach.
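As a preview of the MIP bullet: the usual trick to reformulate \(|\color{darkred}x_i-\color{darkred}x_j|\) is variable splitting, with a binary variable (or an SOS1 pair) to make sure only one part can be nonzero. A sketch (my notation; \(\color{darkblue}M\) is a suitable bound): \[\begin{align}&\color{darkred}x_i-\color{darkred}x_j=\color{darkred}d_{i,j}^+-\color{darkred}d_{i,j}^-\\ &|\color{darkred}x_i-\color{darkred}x_j|=\color{darkred}d_{i,j}^++\color{darkred}d_{i,j}^-\\ &\color{darkred}d_{i,j}^+\le\color{darkblue}M\cdot\color{darkred}\delta_{i,j},\ \ \color{darkred}d_{i,j}^-\le\color{darkblue}M\cdot(1-\color{darkred}\delta_{i,j})\\ &\color{darkred}d_{i,j}^+,\color{darkred}d_{i,j}^-\ge 0,\ \ \color{darkred}\delta_{i,j}\in\{0,1\}\end{align}\] The binary \(\color{darkred}\delta_{i,j}\) (or an SOS1 pair) ensures at most one of the two parts is nonzero, so their sum equals the absolute value.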

Saturday, June 12, 2021

Median, quantiles and quantile regression as linear programming problems

Quantile regression is a bit of an exotic type of regression [1,2,3]. It can be seen as a generalization of \(\ell_1\) or LAD regression [4], and just as LAD regression we can formulate and solve it as an LP.

First I want to discuss some preliminaries: how to find the median and the quantiles of a data vector \(\color{darkblue}y\). That will give us the tools to formulate the quantile regression problem as an LP. The reason for adding these preliminary steps is to develop some intuition about how Quantile Regression problems are defined. I found that most papers just "define" the underlying optimization problem, without much justification. I hope to show with these small auxiliary models how we arrive at the Quantile Regression model. Along the way, we encounter some interesting titbits. I'll discuss a few details that papers typically gloss over or even skip, but that I find fascinating.
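As a preview, the LP behind the \(\color{darkblue}\tau\)-quantile of \(\color{darkblue}y\) can be sketched as (my notation): \[\begin{align}\min\>&\sum_i \left(\color{darkblue}\tau\cdot \color{darkred}e_i^+ + (1-\color{darkblue}\tau)\cdot\color{darkred}e_i^-\right)\\ & \color{darkred}e_i^+ - \color{darkred}e_i^- = \color{darkblue}y_i - \color{darkred}q && \forall i\\ & \color{darkred}e_i^+,\color{darkred}e_i^- \ge 0\end{align}\] With \(\color{darkblue}\tau=0.5\) this gives the median; replacing the scalar \(\color{darkred}q\) by a linear predictor \(\sum_k \color{darkblue}X_{i,k}\color{darkred}\beta_k\) turns it into the quantile regression LP.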

Tuesday, June 8, 2021

Portfolio optimization with risk as constraint?

The standard mean-variance portfolio optimization models have the form:


Model M1
\[\begin{align}\min\>&\color{darkred}{\mathit{Risk}}=\color{darkred}x^T\color{darkblue}Q\color{darkred}x\\ & \color{darkblue}r^T\color{darkred}x \ge \color{darkblue}{\mathit{MinimumReturn}} \\  &\sum_i \color{darkred}x_i = 1\\ & \color{darkred}x \ge 0\end{align}\]


or maybe:
Model M2
\[\begin{align}\min\>&\color{darkred}z=\color{darkred}x^T\color{darkblue}Q\color{darkred}x - \color{darkblue}\lambda \cdot\color{darkblue}r^T\color{darkred}x\\ &\sum_i \color{darkred}x_i = 1\\ & \color{darkred}x \ge 0\end{align}\]


where
  • \(\color{darkblue}Q\) is a variance-covariance matrix (we assume here it is positive semi-definite [1]),
  • \(\color{darkblue}\lambda\) is an exogenous constant that sometimes is varied to draw an efficient frontier, and
  • \(\color{darkblue}r\) are the returns.
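The question in the title suggests a third variant, with the risk in a constraint rather than in the objective. A sketch (the M3 label and the \(\color{darkblue}{\mathit{MaxRisk}}\) bound are my own names):

Model M3
\[\begin{align}\max\>&\color{darkblue}r^T\color{darkred}x\\ & \color{darkred}x^T\color{darkblue}Q\color{darkred}x \le \color{darkblue}{\mathit{MaxRisk}} \\ &\sum_i \color{darkred}x_i = 1\\ & \color{darkred}x \ge 0\end{align}\]

As \(\color{darkblue}Q\) is positive semi-definite, this is a convex quadratically constrained problem, which most QP-capable solvers can still handle.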

Wednesday, June 2, 2021

Total Least Squares: nonconvex optimization

Total Least Squares (TLS) is an alternative to OLS (Ordinary Least Squares). It is a form of orthogonal regression and also deals with the problem of EIV (Errors-in-Variables). 


The standard OLS model is \[\color{darkblue}y = \color{darkblue}X\color{darkred}\beta + \color{darkred}\varepsilon\] where we minimize the sum-of-squares of the residuals \[\min ||\color{darkred}\varepsilon||_2^2\] We can interpret \(\color{darkred}\varepsilon\) as the error in \(\color{darkblue}y\).


In TLS, we also allow for errors in \(\color{darkblue}X\). The model becomes \[\color{darkblue}y+\color{darkred}\varepsilon=(\color{darkblue}X+\color{darkred}E)\color{darkred}\beta\] Note that we made a sign change in \(\color{darkred}\varepsilon\). This is pure aesthetics: to make the equation more symmetric looking. The objective is specified as \[\min \> ||\left(\color{darkred}\varepsilon \> \color{darkred}E\right)||_F\] i.e. the Frobenius norm of the matrix formed by \(\color{darkred}\varepsilon\) and \(\color{darkred}E\). The Frobenius norm is just \[||A||_F=\sqrt{\sum_{i,j}a_{i,j}^2}\] We can drop the square root from the objective (the solution will remain the same, but we got rid of a non-linear function with a possible problem near zero: the gradient is not defined there). The remaining problem is a non-convex quadratic problem which can be solved with global MINLP solvers such as Baron or with a global quadratic solver like Gurobi.
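Written out, the problem to be solved is then (a sketch): \[\begin{align}\min\>&\sum_i \color{darkred}\varepsilon_i^2 + \sum_{i,j} \color{darkred}E_{i,j}^2\\ & \color{darkblue}y+\color{darkred}\varepsilon=(\color{darkblue}X+\color{darkred}E)\color{darkred}\beta\end{align}\] The products \(\color{darkred}E\color{darkred}\beta\) in the constraints are what make this non-convex.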

Sunday, May 23, 2021

Solving linear complementarity problems without an LCP solver

Introduction


The linear complementarity problem (LCP) can be stated as [1]: 


LCP Problem
\[\begin{align} &\text{Find $\color{darkred}x\ge 0,\color{darkred}w\ge 0$ such that:}\\ & \color{darkred}x^T\color{darkred}w = 0 \\ &\color{darkred}w =\color{darkblue}M  \color{darkred}x + \color{darkblue}q\end{align}\]


In many cases we require that \(\color{darkblue}M\) is positive definite.

The LCP has different applications, including in finance, physics, and economics [2]. Off-the-shelf LCP solvers are not widely available. Here I will show how to solve LCPs with other solvers. This question comes up now and then, so here is a short recipe post.
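One possible recipe (a sketch; not necessarily the exact formulations worked out in the post) is to solve the quadratic program \[\begin{align}\min\>&\color{darkred}x^T\color{darkred}w\\ & \color{darkred}w = \color{darkblue}M\color{darkred}x+\color{darkblue}q\\ & \color{darkred}x,\color{darkred}w\ge 0\end{align}\] If the optimal objective value is zero, the corresponding \(\color{darkred}x\) solves the LCP.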

Monday, May 10, 2021

Clustering models

In Statistics (nowadays called Data Science or A.I. for public relations reasons), clustering is one of the most popular techniques available. Of course, nothing beats linear regression in the popularity contest. Here, I'd like to discuss two clustering models: \(k\)-means and \(k\)-medoids. For these models, there exist well-defined equivalent Mixed-Integer Programming models. In practice, they don't work very well except for small data sets. I think they are still useful to discuss for different reasons:

  • The formulations are interesting. They have some angles that may not be obvious at first.
  • They define the problem in a succinct way. Verbal descriptions are always fraught with imprecision and vagueness. A reference model can help to make things explicit. I.e. use a model as a documentation tool.
  • Not all data sets are large. For small data sets, we can prove optimality, where the usual heuristics can only deliver good solutions without much information about their quality. Obviously, ultimate optimality may not matter much here, as clustering is frequently used as an exploratory tool.
  • We can adapt the model to special cases. Adding constraints such as a minimum and maximum number of points per cluster comes to mind [3].
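For reference, a sketch of such an equivalent MIP for the \(k\)-medoids problem (in the style of the classic \(p\)-median formulation; the notation is mine). Here \(\color{darkred}y_j=1\) indicates that point \(j\) is a medoid and \(\color{darkred}x_{i,j}=1\) assigns point \(i\) to medoid \(j\): \[\begin{align}\min\>&\sum_{i,j} \color{darkblue}d_{i,j}\color{darkred}x_{i,j}\\ & \sum_j \color{darkred}x_{i,j} = 1 && \forall i\\ & \color{darkred}x_{i,j} \le \color{darkred}y_j\\ & \sum_j \color{darkred}y_j = \color{darkblue}k\\ & \color{darkred}x_{i,j},\color{darkred}y_j\in\{0,1\}\end{align}\]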

Thursday, April 29, 2021

What is this BRATIO option in GAMS?

This is a note about BRATIO, a very exotic GAMS option related to providing an advanced basis to a Simplex-based LP (or NLP) solver. 
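For reference, setting the option is a one-liner (a hypothetical sketch; m and z are placeholder names for a model and its objective variable):


* a bratio of 1 tells GAMS not to pass an advanced basis to the solver
option bratio = 1;
solve m using lp minimizing z;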

Tuesday, April 20, 2021

Inverting a matrix by LP

 The questions for this post are:

  • Can we model the inversion of a matrix as an LP problem? (A: yes)
  • Should we do that? (A: no)
  • If we insist, can we make it a bit quicker? (A: yes)
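For the first question, the idea is simply to state the defining equations \(\color{darkblue}A\color{darkred}X=\color{darkblue}I\) as an LP with a dummy objective (a sketch): \[\begin{align}\min\>&0\\ & \sum_j \color{darkblue}a_{i,j}\color{darkred}x_{j,k} = \color{darkblue}I_{i,k} && \forall i,k\\ & \color{darkred}x_{j,k}\ \text{free}\end{align}\] One obvious angle for the speed-up in the last bullet is to exploit structure, e.g. by solving for one column of \(\color{darkred}X\) at a time.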

Monday, April 19, 2021

Parallel machine scheduling II, three more formulations

In [1] two formulations were discussed for a scheduling problem with multiple machines. Here we add a few more. Some of them are a bit strange. None of them really works better than the ones in [1].

So this is a collection of formulations that may sound reasonable at first, but are really not suited for this problem. If you want to read about good models, skip this post.

Monday, April 5, 2021

This is really a modeling issue

 In [1], the following scenario is described:


  1. A MIP model is solved, delivering a feasible solution.
  2. All integer variables are fixed to their levels.
  3. The resulting LP model is solved in order to produce duals. This model turns out to be infeasible.

Monday, March 29, 2021

Mathematical Notation

I often tell beginners in optimization modeling that before starting to code, they should get a piece of paper and write down the mathematical model. There are several reasons for this:

Sunday, March 28, 2021

Parallel machine scheduling I, two formulations

Scheduling problems with multiple machines are not that easy for MIP solvers. However, modern solvers are capable of solving reasonably sized problems to optimality quite efficiently. Here are some notes.

Tuesday, March 23, 2021

SOS1 sets: when to use them

Short answer: almost never.


Beginners in optimization are always very excited when they read about SOS sets in the documentation. This looks like a very efficient tool to use for constructs like pick (at most) 1 out of \(n\) (SOS1) or for interpolation (SOS2). However, experienced modelers use SOS sets very sparingly. Why? Well, solvers like binary variables much better than SOS sets. Binary variables yield great bounding and cuts (there has been enormous progress in this area), while SOS constructs are really just about branching. 
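To illustrate the comparison: "at most one of the \(\color{darkred}x_i\) can be nonzero" can be stated by declaring the \(\color{darkred}x_i\) as an SOS1 set, or with binary variables and upper bounds \(\color{darkblue}U_i\) (a sketch, assuming \(\color{darkred}x_i\ge 0\)): \[\begin{align}&\color{darkred}x_i \le \color{darkblue}U_i\cdot\color{darkred}\delta_i\\ & \sum_i \color{darkred}\delta_i \le 1\\ & \color{darkred}\delta_i\in\{0,1\}\end{align}\] The binary version needs reasonable bounds \(\color{darkblue}U_i\), but it gives the solver much more to work with.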

Wednesday, March 17, 2021

MST + few Steiner points: convex quadratic model

In [1], I discussed a model for the Euclidean Steiner Tree problem. A non-convex integer programming model from the literature was reformulated into a MISOCP (not a version of a Japanese dish, but a Mixed-Integer Second-Order Cone Program). Together with a symmetry-breaking constraint and some high-performance solvers, this makes it possible to solve larger models. 

One of the disadvantages of the model in [1] is that it only works with the full number of Steiner points \(n-2\) where \(n\) is the number of given (original) points. Here I want to discuss how we can use Minimum Spanning Tree (MST) models [2], as a building block for a model where we can add just a few Steiner points. One reason this is interesting: the first few Steiner points have the most impact on the objective.

Tuesday, March 16, 2021

Excel Never Dies



In a post by (famous economist) Paul Krugman:

I’m sometimes embarrassed at how much I use Excel. But it turns out to be a work of genius.

The online article he refers to is:

Packy McCormick, Ben Rollert, Excel Never Dies, https://www.notboring.co/p/excel-never-dies

It is a nice read.

In my work, I see Excel used intensively by (almost) all clients I work with or talk to. One sometimes wonders: what would happen if Excel just suddenly stopped working everywhere? 

Monday, March 15, 2021

Euclidean Steiner tree problems as MISOCP

In [1] I looked at MIP models for the Minimum Spanning Tree problem. A related, but much more difficult, problem is the Euclidean Steiner Tree Problem, where we are allowed to add points to make the tree shorter. Here are two obvious examples [2]:




Monday, March 8, 2021

Minimum Spanning Trees in Math Programming Models

Algorithms for the Minimum Spanning Tree (MST) problem are readily available. However, sometimes we want to solve this problem inside a Mathematical Programming model. Usually, this is for two reasons:

  • We have some side constraints
  • Or as part of a larger model
These reasons are essentially the same (a matter of gradation). Embedding an MST inside a model is not totally trivial.
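To give an idea of what such an embedded MST formulation can look like, here is a sketch of a single-commodity flow model (my notation; the post discusses several formulations). A root node \(r\) ships one unit of flow to every other node, and flow is only allowed on selected edges: \[\begin{align}\min\>&\sum_{i\lt j}\color{darkblue}d_{i,j}\color{darkred}x_{i,j}\\ & \sum_{i\lt j}\color{darkred}x_{i,j}=\color{darkblue}n-1\\ & \sum_j\color{darkred}f_{j,i}-\sum_j\color{darkred}f_{i,j}=1&&\forall i\ne r\\ & \color{darkred}f_{i,j}+\color{darkred}f_{j,i}\le(\color{darkblue}n-1)\cdot\color{darkred}x_{i,j}&&\forall i\lt j\\ & \color{darkred}x_{i,j}\in\{0,1\},\ \color{darkred}f_{i,j}\ge 0\end{align}\]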


Data


In our models, I used the data from [1]. This data set has distances for 42 US cities.


----    298 SET cities  

Manchester, N.H.    ,    Montpelier, Vt.     ,    Detroit, Mich.      ,    Cleveland, Ohio     ,    Charleston, W.Va.   
Louisville, Ky.     ,    Indianapolis, Ind.  ,    Chicago, Ill.       ,    Milwaukee, Wis.     ,    Minneapolis, Minn.  
Pierre, S.D.        ,    Bismarck, N.D.      ,    Helena, Mont.       ,    Seattle, Wash.      ,    Portland, Ore.      
Boise, Idaho        ,    Salt Lake City, Utah,    Carson City, Nevada ,    Los Angeles, Calif. ,    Phoenix, Ariz.      
Santa Fe, N.M.      ,    Denver, Colo.       ,    Cheyenne, Wyo.      ,    Omaha, Neb.         ,    Des Moines, Iowa    
Kansas City, Mo.    ,    Topeka, Kans.       ,    Oklahoma City, Okla.,    Dallas, Tex.        ,    Little Rock, Ark.   
Memphis, Tenn.      ,    Jackson, Miss.      ,    New Orleans, La.    ,    Birmingham, Ala.    ,    Atlanta, Ga.        
Jacksonville, Fla.  ,    Columbia, S.C.      ,    Raleigh, N.C.       ,    Richmond, Va.       ,    Washington, D.C.    
Boston, Mass.       ,    Portland, Me.       


The original data set is called dantzig42 from TSPLIB [8].

An optimal spanning tree can look like:


Monday, March 1, 2021

A difficult multi-line regression data set

 In [1], I discussed some models for a small 2-line regression problem. Here I want to dive into a more challenging problem.


Data generation process


I started with the following five lines: \[\begin{align}&f_1(x)=200-3x\\&f_2(x)=150-x\\&f_3(x)=10\\&f_4(x)=-50+2x\\&f_5(x)=-120\end{align}\] To generate points, I used \[\begin{align}&x_i \sim U(0,100)\\ & k_i \sim U\{1,5\} \\ & \epsilon_i \sim N(0,15)\\ & y_i = a_{k_i} + b_{k_i} \cdot x_i + \epsilon_i \end{align}\]
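A minimal GAMS sketch of this data-generation process (set and parameter names are mine; the code in the post may differ):


set i 'observations' /i1*i50/, k 'lines' /k1*k5/;
parameter a(k) 'intercepts' /k1 200, k2 150, k3 10, k4 -50, k5 -120/;
parameter b(k) 'slopes' /k1 -3, k2 -1, k3 0, k4 2, k5 0/;
parameter x(i), y(i);
scalar pick;
x(i) = uniform(0,100);
loop(i,
* draw which line this observation belongs to
  pick = uniformint(1,5);
  y(i) = sum(k$(ord(k)=pick), a(k) + b(k)*x(i)) + normal(0,15);
);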

This corresponds to the following picture:



Thursday, February 25, 2021

A strange regression problem: multiple-line fitting

Consider the following data:


----     14 PARAMETER x  data

i1  17.175,    i2  84.327,    i3  55.038,    i4  30.114,    i5  29.221,    i6  22.405,    i7  34.983,    i8  85.627
i9   6.711,    i10 50.021,    i11 99.812,    i12 57.873,    i13 99.113,    i14 76.225,    i15 13.069,    i16 63.972
i17 15.952,    i18 25.008,    i19 66.893,    i20 43.536,    i21 35.970,    i22 35.144,    i23 13.149,    i24 15.010
i25 58.911,    i26 83.089,    i27 23.082,    i28 66.573,    i29 77.586,    i30 30.366,    i31 11.049,    i32 50.238
i33 16.017,    i34 87.246,    i35 26.511,    i36 28.581,    i37 59.396,    i38 72.272,    i39 62.825,    i40 46.380
i41 41.331,    i42 11.770,    i43 31.421,    i44  4.655,    i45 33.855,    i46 18.210,    i47 64.573,    i48 56.075
i49 76.996,    i50 29.781


----     14 PARAMETER y  data

i1    32.320,    i2   238.299,    i3    87.081,    i4    17.825,    i5   -10.154,    i6   -35.813,    i7    14.506
i8   204.903,    i9  -103.014,    i10   79.521,    i11 -153.679,    i12  -47.971,    i13  277.483,    i14  160.084
i15   28.687,    i16  133.650,    i17  -63.820,    i18    4.028,    i19  145.434,    i20   48.326,    i21   49.295
i22   35.478,    i23  -58.602,    i24   34.463,    i25  -50.666,    i26 -115.522,    i27  -19.540,    i28  134.829
i29  198.074,    i30   -7.351,    i31   40.786,    i32   82.107,    i33   27.661,    i34  -82.669,    i35    5.081
i36   -1.278,    i37  118.345,    i38  172.905,    i39  -57.943,    i40   56.981,    i41  -45.248,    i42   34.517
i43   11.780,    i44  -98.506,    i45   -1.064,    i46   33.323,    i47  139.050,    i48  -35.595,    i49 -102.580
i50    3.912


When we plot this, we see:



Saturday, February 20, 2021

2d bin packing with Google OR-Tools CP-SAT, how to fix?

In [1] I looked at some MIP formulations for a 2d bin packing problem. OR-Tools has a nice model.AddNoOverlap2D constraint function. So, I was excited to try this out. (In these times I get quickly enthusiastic about these things). 

The idea is as follows. ortools has this construct Constraint.OnlyEnforceIf(boolean). This is essentially an indicator constraint: \((\color{darkred}x\ge 5)\text{.OnlyEnforceIf($\color{darkred}\delta$)}\) means \[\color{darkred}\delta=1 \Rightarrow \color{darkred}x \ge 5\] This inspires me to develop a model along the lines of: \[\begin{align} \min& \sum_k \color{darkred}u_k\\  & \color{darkred}\beta_{i,j} = 0 \Rightarrow \color{darkred}b_i\ne\color{darkred}b_j \\ & \color{darkred}\beta_{i,j} = 1 \Rightarrow \mathbf{NoOverlap2D}(i,j) \\ & \color{darkred}u_k = 0 \Rightarrow \color{darkred}b_i \lt k \end{align}\] Here \(\color{darkred}b_i\) is the bin number where item \(i\) is placed and \(\color{darkred}\beta_{i,j}\) is true if  \(\color{darkred}b_i=\color{darkred}b_j\).


OK, so I started coding:

Thursday, February 18, 2021

An easy variant of 2d bin packing

 In [1] the following problem is posted:

Consider the 2d bin packing problem [2]. Now we have the following additional restrictions: items of a different type or category cannot be below or above each other. In other words, we arrange items of the same type in columns. 

We use again our small data set: 


----     37 PARAMETER itemSize  sizes of bin and items

               w           h

cat1       7.000      12.000
cat2       9.000       3.000
cat3       5.000      14.000
cat4      13.000       9.000
cat5       6.000       8.000
cat6      20.000       5.000
bin       40.000      60.000


----     37 SET itemMap  mapping between items and categories

item1 .cat1,    item2 .cat1,    item3 .cat1,    item4 .cat1,    item5 .cat1,    item6 .cat1,    item7 .cat1
item8 .cat1,    item9 .cat1,    item10.cat1,    item11.cat2,    item12.cat2,    item13.cat2,    item14.cat2
item15.cat2,    item16.cat2,    item17.cat2,    item18.cat2,    item19.cat2,    item20.cat2,    item21.cat3
item22.cat3,    item23.cat3,    item24.cat3,    item25.cat3,    item26.cat3,    item27.cat3,    item28.cat3
item29.cat3,    item30.cat3,    item31.cat4,    item32.cat4,    item33.cat4,    item34.cat4,    item35.cat4
item36.cat4,    item37.cat4,    item38.cat4,    item39.cat4,    item40.cat4,    item41.cat5,    item42.cat5
item43.cat5,    item44.cat5,    item45.cat5,    item46.cat6,    item47.cat6,    item48.cat6,    item49.cat6
item50.cat6


I'll try two different models:

  • Assign items to bins, arrange them in columns
  • Assign columns to bins

Tuesday, February 16, 2021

2d Bin Packing

 In [1], a question was posed about a particular type of 2d bin-packing. This is a good reason to play with some models.

First I discuss three formulations with continuous no-overlap constraints and one for discrete data. In the next post, I'll discuss the variant described in [1]. This actually leads to a very different, but easy, problem.


Problem statement and example data


We consider \(n=50\) rectangular items of different sizes that have to be placed into bins. There are six different categories, each with a different size (height and width). All the bins have the same size. The goal is to use as few bins as possible.


----     37 PARAMETER itemSize  sizes of bin and items

               w           h

cat1       7.000      12.000
cat2       9.000       3.000
cat3       5.000      14.000
cat4      13.000       9.000
cat5       6.000       8.000
cat6      20.000       5.000
bin       40.000      60.000


----     37 SET itemMap  mapping between items and categories

item1 .cat1,    item2 .cat1,    item3 .cat1,    item4 .cat1,    item5 .cat1,    item6 .cat1,    item7 .cat1
item8 .cat1,    item9 .cat1,    item10.cat1,    item11.cat2,    item12.cat2,    item13.cat2,    item14.cat2
item15.cat2,    item16.cat2,    item17.cat2,    item18.cat2,    item19.cat2,    item20.cat2,    item21.cat3
item22.cat3,    item23.cat3,    item24.cat3,    item25.cat3,    item26.cat3,    item27.cat3,    item28.cat3
item29.cat3,    item30.cat3,    item31.cat4,    item32.cat4,    item33.cat4,    item34.cat4,    item35.cat4
item36.cat4,    item37.cat4,    item38.cat4,    item39.cat4,    item40.cat4,    item41.cat5,    item42.cat5
item43.cat5,    item44.cat5,    item45.cat5,    item46.cat6,    item47.cat6,    item48.cat6,    item49.cat6
item50.cat6


Furthermore, we assume items cannot be rotated.
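The continuous no-overlap constraints mentioned above typically look as follows (a sketch for two items \(i\) and \(k\) placed in the same bin, with \((\color{darkred}x_i,\color{darkred}y_i)\) the position of the lower-left corner): \[\color{darkred}x_i+\color{darkblue}w_i\le\color{darkred}x_k \ \vee\ \color{darkred}x_k+\color{darkblue}w_k\le\color{darkred}x_i \ \vee\ \color{darkred}y_i+\color{darkblue}h_i\le\color{darkred}y_k \ \vee\ \color{darkred}y_k+\color{darkblue}h_k\le\color{darkred}y_i\] At least one of these four disjuncts must hold; in a MIP such an "or" is typically implemented with binary variables and big-M (or indicator/SOS1) constructs.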

Friday, February 5, 2021

Non-differentiable NLP vs LP

 In [1] the following simple problem is posed:


Given data 

p = [50, 50.5, 52, 53, 55, 55.5, 56, 57, 60, 61]

try to track this as closely as possible by minimizing the largest deviation, while obeying the restrictions:


\[\begin{align}&\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]


Non-differentiable NLP model


An obvious way to formulate this is: \[\begin{align} \min&\max_i |\color{darkred}x_i-\color{darkblue}p_i| \\ &\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]

This is a non-differentiable formulation, and we are asking for trouble when trying to solve it with a standard NLP solver. For instance, when I solve this with CONOPT, we may see:


               S O L V E      S U M M A R Y

     MODEL   m1                  OBJECTIVE  z
     TYPE    DNLP                DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  25

**** SOLVER STATUS     4 Terminated By Solver      
**** MODEL STATUS      7 Feasible Solution         
**** OBJECTIVE VALUE                1.2500

 RESOURCE USAGE, LIMIT          0.062 10000000000.000
 ITERATION COUNT, LIMIT        31    2147483647
 EVALUATION ERRORS              0             0
CONOPT 3         33.2.0 r4f23b21 Released Dec 01, 2020 WEI x86 64bit/MS Window
 
 
    C O N O P T 3   version 3.17L
    Copyright (C)   ARKI Consulting and Development A/S
                    Bagsvaerdvej 246 A
                    DK-2880 Bagsvaerd, Denmark
 
 
    The model has 11 variables and 10 constraints
    with 29 Jacobian elements, 10 of which are nonlinear.
    The Hessian of the Lagrangian has 10 elements on the diagonal,
    45 elements below the diagonal, and 10 nonlinear variables.
 
                   Pre-triangular equations:   0
                   Post-triangular equations:  10
 
 
 ** Feasible solution. Convergence too slow. The change in objective
    has been less than 3.7500E-12 for 20 consecutive iterations


CONOPT realizes it is in trouble and declares the solution just feasible instead of optimal. Indeed, we can do better than an objective of 1.25. Other NLP solvers may not even know they are in trouble and just report a solution and declare it (locally) optimal. 

There is a big problem with this model. Most NLP solvers assume the objective and nonlinear constraints are smooth (i.e., differentiable). Here we don't have that, and this lack of smoothness is often a recipe for disaster. This model is a nice example. 

I often complain about non-differentiability in models I see on forums. Often the reaction is: "Who cares?". Indeed, often NLP solvers will deliver some solution without warning. This post is about a small problem, where we really can do much better.   

Note that the initial point also plays a role here. With a different starting point, you may find different solutions.

A global NLP solver may help for this model. Some of the ones I tried don't support the max() function, so we need to reformulate the model. When doing that, we can actually do a bit better and form a linear programming problem.
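For completeness, the LP that results from the standard minimax reformulation looks like this (a sketch): \[\begin{align}\min\>&\color{darkred}z\\ & -\color{darkred}z\le\color{darkred}x_i-\color{darkblue}p_i\le\color{darkred}z&&\forall i\\ & \color{darkred}x_1=\color{darkblue}p_1\\ & \color{darkred}x_i-\color{darkred}x_{i-1}\le 1.5\end{align}\]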

Wednesday, February 3, 2021

Dijkstra confusion

 Dijkstra's algorithm is a famous algorithm for the shortest path problem. The original paper is from 1959 [1]:


(The format of this paper is interesting: almost no mathematical notation.) Dijkstra's algorithm is not suited for all shortest path problems. Negative weights or lengths can cause problems. There are basically three different claims I see mentioned:

  1. Dijkstra just does not work correctly if there are negative weights.
  2. Dijkstra does not work if there are negative cycles in the graph. This would imply that a problem with negative weights would work as long as there are no negative cycles. 
  3. Keep things vague and mumble a bit about negative weights and negative cycles.

I notice that option 3 seems very popular. As a result, looking at questions on stackexchange, I see much confusion about this. Documentation and error/warning messages are also often vague or misleading. For instance, let's have a look at [2]:

Also, this routine does not work for graphs with negative distances. Negative distances can lead to infinite cycles that must be handled by specialized algorithms such as Bellman-Ford’s algorithm or Johnson’s algorithm.

Well, this is indeed sufficiently vague. We are dealing with option 3: we don't know whether this refers to claim 1 or 2. The same function that this documentation is about can produce the following warning:

UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford.
  distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)

This seems to say we only need to worry about graphs with negative cycles. So that would be option 2. 


A small counter example


In [3] a small acyclic graph (a DAG: Directed Acyclic Graph) is presented:


When we try to find the shortest path from n0 to n2, using [2], we can see:

dijkstra
shortest path length: 1.0
johnson
shortest path length: -8.0
bellman_ford
shortest path length: -8.0
<ipython-input-25-7a1de3894d98>:13: UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford.
  distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)


This example disproves that we just have to watch out for negative cycles.  That means the documentation and warning messages should state clearly that the function scipy.sparse.csgraph.dijkstra does not work correctly with negative weights, irrespective of whether there are negative cycles.

Thursday, January 28, 2021

Assignment problem with a wrinkle formulated as a network problem

In a previous post [1] I discussed a simple problem. But it turned out not so easy to solve for some of the larger data sets. Basically, it was an assignment problem with an extra condition. The problem was as follows:

Consider two arrays \(\color{darkblue}a_i\) (length \(\color{darkblue}m\)) and \(\color{darkblue}b_j\) (length \(\color{darkblue}n\)) with \(\color{darkblue}m \lt \color{darkblue}n\). Assign all values \(\color{darkblue}a_i\) to a \(\color{darkblue}b_j\) such that:

  • Each \(\color{darkblue}b_j\) can have 0 or 1 \(\color{darkblue}a_i\) assigned to it.
  • The assignments need to maintain the original order of \(\color{darkblue}a_i\). I.e. if \(\color{darkblue}a_i \rightarrow \color{darkblue}b_j\) then \(\color{darkblue}a_{i+1}\) must be assigned to a slot in \(\color{darkblue}b\) that is beyond slot \(j\). In the picture below that means that arrows cannot cross.
  • Do this while minimizing the sum of the products.


In [1], I attacked this as a mixed-integer programming problem. In this post, I want to see if we can solve this as a network problem. This was largely inspired by the comments in [1].


Shortest path algorithms and models


The shortest path problem can be solved in different ways. Dijkstra's algorithm is a popular choice due to its simplicity and speed. In [2] we can read a quote by Edsger Dijkstra:

What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given city. It is the algorithm for the shortest path, which I designed in about twenty minutes. One morning I was shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the shortest path. As I said, it was a twenty-minute invention. In fact, it was published in '59, three years later. The publication is still readable, it is, in fact, quite nice. One of the reasons that it is so nice was that I designed it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper is that you are almost forced to avoid all avoidable complexities. Eventually, that algorithm became to my great amazement, one of the cornerstones of my fame.
For an early implementation in a computer program, Dijkstra used a set of 64 nodes corresponding to Dutch cities. 64 was chosen so the node number could be encoded in 6 bits [2]. 

The shortest path can also be formulated as an LP problem. Standard LP solvers can handle quite large problems (each node becomes a constraint, and each arc a variable). In addition, a solver like Cplex contains a specialized network solver for such LPs.  
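That LP formulation is just a min-cost flow problem shipping one unit from the source to the sink (a sketch): \[\begin{align}\min\>&\sum_{(i,j)}\color{darkblue}c_{i,j}\color{darkred}x_{i,j}\\ & \sum_j\color{darkred}x_{i,j}-\sum_j\color{darkred}x_{j,i}=\color{darkblue}b_i&&\forall i\\ & \color{darkred}x_{i,j}\ge 0\end{align}\] with \(\color{darkblue}b_{\mathit{source}}=1\), \(\color{darkblue}b_{\mathit{sink}}=-1\), and \(\color{darkblue}b_i=0\) otherwise.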

Graph 


We start with the nodes. We denote the nodes by \(\color{darkblue}n_{i,j}\) representing: \(\color{darkblue}a_i\) is assigned to \(\color{darkblue}b_j\). Not all assignments are possible. For instance, we cannot assign \(\color{darkblue}a_2\) to \(\color{darkblue}b_1\). That means: node  \(\color{darkblue}n_{2,1}\)  does not exist.

We also need a source node and a sink node.

The arcs indicate: after assigning \(\color{darkblue}a_i \rightarrow \color{darkblue}b_j\) we need to assign the next item \(\color{darkblue}a_{i+1}\rightarrow \color{darkblue}b_{j+k}\) for some \(k\ge 1\). In addition we need to connect the source node to all nodes with \(i=1\) and the sink node to all nodes with \(i=3\) (our last \(\color{darkblue}a_i\)).  So our network looks like:

Network representation


Note how any arc not connected to the source or sink node goes to the right and downwards.

We have costs for visiting each node: \(\color{darkblue}a_i\cdot\color{darkblue}b_j\).  As we want to formulate this as a shortest path problem, we need to allocate these costs to arcs. I used the incoming arcs for this. So any arc \(e_{i,j,i',j'}\) gets a cost \(\color{darkblue}a_{i'}\cdot\color{darkblue}b_{j'}\).  The arcs to the sink node have a zero cost. 

Note: this graph is acyclic, so negative arc lengths are ok. We will never see negative cycles.

As you can see: because the nodes have two indices, the arcs have four! That will be fun.

Wednesday, January 27, 2021

An assignment problem with a wrinkle

In [1], the following problem is posed:

Consider two arrays \(\color{darkblue}a_i\) (length \(\color{darkblue}m\)) and \(\color{darkblue}b_j\) (length \(\color{darkblue}n\)) with \(\color{darkblue}m \lt \color{darkblue}n\). Assign all values \(\color{darkblue}a_i\) to a \(\color{darkblue}b_j\) such that:

  • Each \(\color{darkblue}b_j\) can have 0 or 1 \(\color{darkblue}a_i\) assigned to it.
  • The assignments need to maintain the original order of \(\color{darkblue}a_i\). I.e. if \(\color{darkblue}a_i \rightarrow \color{darkblue}b_j\) then \(\color{darkblue}a_{i+1}\) must be assigned to a slot in \(\color{darkblue}b\) that is beyond slot \(j\). In the picture below that means that arrows cannot cross.
  • Do this while minimizing the sum of the products.


Mixed-integer programming model

This problem can be viewed as an assignment problem with a side constraint:


MIP Model
\[\begin{align}\min& \sum_{i,j}\color{darkred}x_{i,j}\cdot\color{darkblue}a_i\cdot\color{darkblue}b_j \\ &\sum_j \color{darkred}x_{i,j}=1 &&\forall i\\ & \sum_i \color{darkred}x_{i,j}\le 1 &&\forall j\\ & \color{darkred}v_i = \sum_j j \cdot \color{darkred} x_{i,j}\\ & \color{darkred}v_i \ge \color{darkred}v_{i-1}+1\\ & \color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}v_i \ge 1\end{align}\]


Here \(\color{darkred}x_{i,j}\) indicates the assignment \(\color{darkblue}a_i \rightarrow \color{darkblue}b_j\). The variable \(\color{darkred}v_i\) represents the position in \(\color{darkblue}b\) to which \(\color{darkblue}a_i\) is assigned.