Friday, September 17, 2021

Reallocate people: a very small but interesting LP

U-Haul Moving Van [link]

This is a strange little problem [1]. We have individuals (of different categories) and regions with a given capacity. The initial set of assigned persons fits within the capacity of each region. Now a new batch of individuals arrives with their chosen zones. Unfortunately, the total demand (old + new individuals) does not fit anymore in all cases. This means we need to reallocate people. We can do this at a price (the costs are given). What is the optimal (least cost) new allocation?

Friday, September 10, 2021

Huber regression: different formulations

Peter Huber, Swiss statistician 

Traditionally, regression, well-known and widely used, is based on a least-squares objective: we have quadratic costs. A disadvantage of this approach is that large deviations (say due to outliers) get a very large penalty. As a result, these outliers can affect the results dramatically.

One alternative regression method that tries to remedy this is L1- or LAD-regression (LAD=Least Absolute Deviation) [1]. Here I want to discuss a method that is somewhere in the middle between L1 regression and least-squares: Huber or M-Regression [2].  Huber regression uses a loss function that is quadratic for small deviations and linear for large ones. It is defined in a piecewise fashion. The regression can look like the following non-linear optimization problem:

NLP Model
\[\begin{align} \min_{\color{darkred}\beta,\color{darkred}\varepsilon} &\sum_i \rho(\color{darkred}\varepsilon_i) \\ &\color{darkblue}y_i = \sum_j \color{darkblue}x_{i,j}\color{darkred}\beta_j +\color{darkred}\varepsilon_i \end{align}  \] where \[\rho(\color{darkred}\varepsilon_i) = \begin{cases}\color{darkred}\varepsilon^2_i & \text{if $|\color{darkred}\varepsilon_i | \le \color{darkblue}k$} \\  2\color{darkblue}k |\color{darkred}\varepsilon_i |-\color{darkblue}k^2 & \text{otherwise}\end{cases} \]

Tuesday, August 31, 2021

matrix operations via GAMS Embedded Python: multi-regional Input-Output tables

Wassily Leontief in front of an IO table, credit:NYU

Inverting a dense matrix

GAMS allows running pieces of Python code as part of a GAMS model. Unfortunately, the interface is rather low-level and inefficient.  Usually "low-level" is associated with high-performance, but in the Python world, this is not the case. Here is an example.

Sunday, August 15, 2021

Stable Marriage Problem

An inter-cast marriage ceremony in Lalitpur [6]

The Stable Marriage Problem is a fascinating problem. In this problem, we want to assign ("marry") men to women much like the assignment problem. (I am just following the conventions in the literature here.) There is a twist, however. We want to require that the matchings are stable: there does not exist a combination \((m,w)\) such that \(m\) and \(w\) both prefer each other above their current partner [1].

There are famous algorithms for this problem [2]. But here, I want to look at it as an integer programming problem.

Sunday, August 8, 2021

A network model: Pyomo vs GAMS

Network models are an important class of optimization models, in itself but also as part of larger models. In [1] a nice presentation is given on how to set up a network model in the Python-based modeling tool Pyomo.


The simple shortest path problem is reproduced here:

Saturday, July 31, 2021

A scheduling problem: a modeling approach

 Here is a simple scheduling problem [1]:

  • We have \(T\) time periods (weeks)
  • and \(N\) jobs to be scheduled.
  • Each job has a duration or processing time (again expressed in weeks), and a given resource (personnel) requirement during the execution of the job. This demand is expressed as: \(\color{darkblue}{\mathit{resource}}_{j,t}\)  where \(j\) indicates the job and \(t\) indicates the week since the start of the job.
  • Not in [1], but I added them to make things a bit more realistic: we have some release dates indicating that a job may not start before a given date. A reason can be that raw material is not available before that date.
  • Similarly, I added due dates: a job must finish before a given date.
  • The objective is to minimize the maximum amount of resources we need over the whole planning period. We can think of this as the capacity we need (e.g. number of personnel).
In this post, I want to show a standard approach to model this problem and opposed to that, how I would attack this.

Tuesday, July 27, 2021

A variant of an assignment problem

 In one [1], the following problem is sketched:

  • We have different types of workers. In the data set below, I call them type-A and type-B workers. 
  • We need to assign workers to jobs.
  • We have more workers than jobs, so we can execute all jobs. 
  • This is a standard assignment problem: \[\begin{align} \min &\sum_{w,j} \color{darkblue}c_{w,j} \color{darkred}x_{w,j}\\ & \sum_j \color{darkred}x_{w,j} \le 1 &&\forall w \\ & \sum_w \color{darkred}x_{w,j} = 1 && \forall j \\ & \color{darkred}x_{w,j} \in [0,1] \end{align}\]
  • The complication is that we can form teams of two workers, one of type A and one of type B. Such a team can work faster and execute the job in less time.
  • How to handle this?

I don't know how to handle this within an assignment problem, but stating this as a MIP allows us to make some progress. Allowing two persons to work on a job is easy. Just replace the second constraint by \[1 \le \sum_w \color{darkred}x_{w,j} \le 2\] Unfortunately this ignores that we need one worker of each type in a team. Also, this makes it difficult to impose a cost for each specific team.

Tuesday, July 20, 2021

Spatial Equilibrium

Just a quick experiment with a small price-endogenous spatial equilibrium model.

In this problem, we consider commodities (goods) produced and consumed in different regions. We can trade between regions.

This price-endogenous spatial model will find both equilibrium prices, supplies, demand quantities, and trade patterns. Our model will have a connection with the transportation model. The difference is that we formulate the model as a system of complementarity constraints (no objective), with the duals explicitly in the model as variables. Or, in other words, we solve the KKT conditions.

I use a small data set from [3], and see if we can reproduce things.

Monday, July 12, 2021

GAMS Singleton Set Issue

There is a nasty problem with the singleton set in GAMS. A singleton set is a set that can contain only zero or one element. Adding more elements implies replacement. I don't use it very often. Here is a reason why. 

Consider the small example:

set i /i1*i20/;
parameter p(i);
p(i) = uniform(0,10);

* find last element >= 5
scalar threshold /5/;
singleton set k(i);
  k(i)$(p(i)>=threshold) =

display p,k;

I would expect to see as output for k the last element in p(i) that is larger than 5. However, we see:

----     12 PARAMETER p  

i1  1.717,    i2  8.433,    i3  5.504,    i4  3.011,    i5  2.922,    i6  2.241,    i7  3.498,    i8  8.563
i9  0.671,    i10 5.002,    i11 9.981,    i12 5.787,    i13 9.911,    i14 7.623,    i15 1.307,    i16 6.397
i17 1.595,    i18 2.501,    i19 6.689,    i20 4.354

----     12 SET k  

                                                      ( EMPTY )

Monday, July 5, 2021

Another sports schedule problem

Wimbledon 2010 Mixed Doubles (source)


In [1] a question is asked about designing a sports schedule as follows:

  • There are 16 players, 7 rounds, and 4 games per round.
  • A game consists of 2 teams with 2 players per team.
  • Each player has a rating. A team rating is the sum of the ratings of the two members.
  • A player plays exactly one game in each round.
  • A player can be partnered with another player only once.
  • A player can play against another player only twice. 
  • Design a schedule that has the minimum difference in ratings between opposing teams.

I'll formulate two different models. One with a small number of binary variables. And one with a large number of binary variables. As we shall see the model with the larger number of variables will win.