Monday, November 4, 2024

Sorting using a MIP model

This is not, per se, very useful, but sorting a parameter inside a MIP model is not very easy for MIP solvers. Obviously, if you need a sorted parameter in the model, it is better to use a sorting algorithm. But useless models can still be interesting.

I use: 

    Input: a 1-dimensional parameter with values.
    Output: a 1-dimensional variable with the values sorted in ascending order.

We can implement this with a permutation matrix \(\color{darkred}X\), which is a permuted identity matrix. In a MIP context, this becomes a binary variable with some assignment constraints.


MIP Model for sorting \(p_i\)
\[\begin{align}\min\>&\color{darkred}z=0\\& \sum_i \color{darkred}x_{i,j} = 1&&\forall j\\ & \sum_j \color{darkred}x_{i,j} = 1&&\forall i\\ & \color{darkred}y_i = \sum_j \color{darkred}x_{i,j}\cdot\color{darkblue}p_j\\& \color{darkred}y_i \ge \color{darkred}y_{i-1}\\ & \color{darkred}x_{i,j} \in \{0,1\}\end{align}\]
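To see what the constraints do, here is a minimal Python sketch that replaces the MIP solver by brute-force enumeration of all permutation matrices. The function name `sort_via_permutation` and the data are made up for illustration; this is not how a MIP solver searches, and it is only practical for tiny inputs.

```python
from itertools import permutations

def sort_via_permutation(p):
    """Brute-force stand-in for the MIP: try every permutation matrix X."""
    n = len(p)
    for perm in permutations(range(n)):
        # x[i][j] = 1 if position i of the output receives p[j]
        x = [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]
        # y_i = sum_j x_ij * p_j
        y = [sum(x[i][j] * p[j] for j in range(n)) for i in range(n)]
        # ordering constraint: y_i >= y_{i-1}
        if all(y[i] >= y[i - 1] for i in range(1, n)):
            return x, y
    raise ValueError("no feasible permutation found")

p = [3.0, 1.0, 2.0]   # hypothetical input parameter
x, y = sort_via_permutation(p)
print(y)
```

Any feasible point of the model is a sorted vector, which is why the objective can be a dummy.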

Saturday, October 26, 2024

PuLP surprises

Formulating optimization models inside traditional programming languages such as Python is very popular. The main tool the developers use to make this possible is operator overloading. There are cases where we can write code that looks somewhat reasonable and is accepted and processed without any warning or error message, but is total nonsense. It is rather difficult to make things airtight with this approach, especially when it comes to error handling. In [1], we see a good example. I have created a small fragment here that illustrates the problem.

Tuesday, October 22, 2024

Non-convex Quadratic Integer Programming

Here, I want to revisit a particular model from [1]:


Model 3: Quadratic Preemptive Model
\[\begin{align}\max\>&\color{darkred}z_{model3}=\sum_p \color{darkred}z_p \\ & \color{darkred}z_p = \sum_{p',g} \color{darkblue}{\mathit{pref}}_{p,p'}\cdot \color{darkred}{\mathit{assign}}_{p,g}\cdot\color{darkred}{\mathit{assign}}_{p',g} \\ & \color{darkblue}z_{model2}^* \le \color{darkred}z_p & \forall p\\ & \sum_g \color{darkred}{\mathit{assign}}_{p,g} = 1 & \forall p \\ & \sum_p \color{darkred}{\mathit{assign}}_{p,g} = \color{darkblue}{\mathit{groupSize}} & \forall g \\& \color{darkred}{\mathit{assign}}_{p,g} \in \{0,1\}\end{align}\]

Thursday, October 17, 2024

Equity in optimization models

In optimization models, we often use an aggregate measure in the objective function, such as total profit, the sum of tardiness of jobs, and countrywide GDP. This can lead to particularly bad results for some individuals or groups.

Here is an example I have used on several occasions. 

Problem Statement

We have \(N\) persons. They must be assigned to \(M\) groups or teams. For simplicity, we can assume \(N\) is a multiple of \(M\), so the group size is \[\frac{N}{M}\] Each person \(p_1\) can specify a preference for being placed in the same group as another person \(p_2\). A negative preference indicates that \(p_1\) would rather be in a different group than \(p_2\). Find an optimal assignment taking into account these preferences. 
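A tiny Python sketch can show why an aggregate objective may treat individuals badly. The preference matrix below is invented for illustration, and `satisfaction` is a hypothetical helper that sums, for each person, the preferences toward the other members of the same group.

```python
def satisfaction(pref, group_of):
    """Per-person score: sum of pref[p][q] over all other q in p's group."""
    n = len(pref)
    return [
        sum(pref[p][q] for q in range(n) if q != p and group_of[q] == group_of[p])
        for p in range(n)
    ]

# Hypothetical data: 4 persons, 2 groups of size 2.
pref = [
    [0, 10, 1, 0],   # person 0 strongly prefers person 1
    [-3, 0, 0, 1],   # person 1 would rather avoid person 0
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

plan_a = [0, 0, 1, 1]   # groups {0,1} and {2,3}
plan_b = [0, 1, 0, 1]   # groups {0,2} and {1,3}

za = satisfaction(pref, plan_a)
zb = satisfaction(pref, plan_b)
print("plan A:", za, "total", sum(za), "worst", min(za))
print("plan B:", zb, "total", sum(zb), "worst", min(zb))
```

With this data, plan A has the larger total, while plan B is better for the worst-off person. That is the tension between an aggregate objective and an equity-oriented one.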

Wednesday, October 16, 2024

GAMS 48 tests

Some minor quibbles. 

gdx2sqlite

The latest version of GAMS contains a replacement of gdx2sqlite. This dumps a GDX file into a SQLite database. It is a tool I use a lot. Here is a comparison using the indus89 model in the GAMS model library:

Wednesday, October 2, 2024

Prevent Loops in GAMS

 




This book [1] on DEA models has an accompanying website with all the GAMS models [2]. 

Of course, I'll be doing some nitpicking on the GAMS code. 

Saturday, September 28, 2024

CSV readers mutilating my data

R and CSV files

When I deal with regional codes such as FIPS[1] and HUC[2], CSV file readers often mutilate my regions. Here is an example in R


Saturday, September 21, 2024

Solving DEA Models with GAMS

Data Envelopment Analysis (DEA) models are somewhat special. They typically consist of small LPs, of which a whole bunch have to be solved. The CCR formulation (after [1]), for the \(i\)-th DMU (Decision Making Unit), can be stated as [2]:


CCR LP Model
\[\begin{align} \max \>& \color{darkred}{\mathit{efficiency}}_i=\sum_{\mathit{outp}} \color{darkred}u_{{\mathit{outp}}} \cdot \color{darkblue}y_{i,{\mathit{outp}}} \\ & \sum_{\mathit{inp}} \color{darkred}v_{{\mathit{inp}}} \cdot \color{darkblue}x_{i,{\mathit{inp}}} = 1 \\ & \sum_{\mathit{outp}} \color{darkred}u_{{\mathit{outp}}} \cdot \color{darkblue}y_{j,{\mathit{outp}}} \le \sum_{\mathit{inp}} \color{darkred}v_{{\mathit{inp}}} \cdot \color{darkblue}x_{j,{\mathit{inp}}} && \forall j \\ & \color{darkred}u_{{\mathit{outp}}} \ge 0, \color{darkred}v_{{\mathit{inp}}} \ge 0 \end{align}\]

Wednesday, September 4, 2024

Multiple Solutions in Minimum Spanning Tree example

In [1], I discussed some LP and MIP formulations for the Minimum Spanning Tree (MST) problem. 


Minimum Spanning Tree visualized through Google Maps


Here, I focus on two formulations: a multicommodity network approach (this can be solved as a large LP) and a MIP formulation based on techniques we know from the Traveling Salesman Problem (TSP). The main issue I want to discuss is the presence of multiple optimal solutions.
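As a small illustration of how ties in edge weights lead to multiple optimal trees, the following Python sketch enumerates all spanning trees of a tiny graph by brute force and keeps the cheapest ones. This is not one of the two formulations discussed here; the graph, the function names, and the enumeration approach are assumptions chosen to keep the example short.

```python
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every subset of n-1 edges that connects all n nodes."""
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(a):
            # union-find lookup with path halving
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a

        merged = 0
        for u, v, _ in subset:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                merged += 1
        if merged == n - 1:   # n-1 successful merges: acyclic and connected
            yield subset

def optimal_trees(n, edges):
    """Return the minimum total weight and all spanning trees attaining it."""
    trees = list(spanning_trees(n, edges))
    best = min(sum(w for _, _, w in t) for t in trees)
    return best, [t for t in trees if sum(w for _, _, w in t) == best]

# Hypothetical graph: a 4-cycle with unit weights plus one heavier diagonal.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 2)]
best, trees = optimal_trees(4, edges)
print(best, len(trees))
```

Dropping any one of the four unit-weight cycle edges gives an optimal tree, so even this toy instance has several optimal solutions.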

Sunday, September 1, 2024

N-queens and solution pool

In [1], I described some chess-related problems. Here, I want to reproduce the \(n\)-queens problem. The single solution problem, placing as many queens on the chess board as possible so they don't attack each other, is pretty standard. I want to focus on the more complex question: How many different ways can we place those queens? In other words: what are all the optimal solutions? We can do this by adding a no-good constraint that forbids the previously found solution. However, as this problem has more than a handful of different solutions, I want to use the Cplex solution pool.

Single Solution Model

We define the decision variables as: \[\color{darkred}x_{i,j} = \begin{cases} 1 & \text{if we place a queen on the square $(i,j)$} \\ 0 & \text{otherwise}\end{cases}\] 
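For small boards, the number of solutions can be cross-checked without a MIP solver. The sketch below counts placements of \(n\) non-attacking queens with plain backtracking. It is a swapped-in technique for verification purposes, not the solution-pool approach, and `count_queens` is a name invented here.

```python
def count_queens(n):
    """Count placements of n non-attacking queens on an n x n board."""
    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for c in range(n):
            # skip squares attacked via a column or either diagonal
            if c in cols or (row + c) in diag1 or (row - c) in diag2:
                continue
            total += place(row + 1, cols | {c}, diag1 | {row + c}, diag2 | {row - c})
        return total

    return place(0, frozenset(), frozenset(), frozenset())

for n in (4, 6, 8):
    print(n, count_queens(n))
```

For the standard \(8\times 8\) board this gives the well-known count of 92 solutions.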

Chess Board


Wednesday, August 28, 2024

Circle Packing and HTML reporting

Little example. Here, we try to pack \(n\) circles with a given radius \(r_i\) into a larger disc with an unknown radius \(R\). The goal is to minimize \(R\). The underlying model is simple: 

Packing of Circles
\[\begin{align} \min\> & \color{darkred}R \\ & \sum_c \left(\color{darkred}p_{i,c}-\color{darkred}p_{j,c}\right)^2 \ge \left(\color{darkblue}r_i+\color{darkblue}r_j\right)^2 & \forall i\lt j \\ & \sum_c \color{darkred}p_{i,c}^2 \le \left(\color{darkred}R-\color{darkblue}r_i\right)^2 & \forall i \\ & \color{darkred}R \ge 0\\ & c \in \{x,y\} \\ \end{align}\]
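A candidate solution can be verified against these constraints with a few lines of Python. The helper `is_feasible` below is a hypothetical checker, not part of the model. It assumes the enclosing disc is centered at the origin and additionally requires \(R \ge r_i\), which the squared containment constraint does not imply by itself.

```python
def is_feasible(centers, radii, R, tol=1e-9):
    """Check non-overlap and containment for circles inside a disc of radius R."""
    n = len(radii)
    for i in range(n):
        xi, yi = centers[i]
        # containment: |p_i|^2 <= (R - r_i)^2, with R >= r_i
        if R < radii[i] or xi * xi + yi * yi > (R - radii[i]) ** 2 + tol:
            return False
        for j in range(i + 1, n):
            xj, yj = centers[j]
            # non-overlap: |p_i - p_j|^2 >= (r_i + r_j)^2
            if (xi - xj) ** 2 + (yi - yj) ** 2 < (radii[i] + radii[j]) ** 2 - tol:
                return False
    return True

# Hypothetical data: two unit circles side by side.
centers = [(-1.0, 0.0), (1.0, 0.0)]
radii = [1.0, 1.0]
print(is_feasible(centers, radii, R=2.0))
print(is_feasible(centers, radii, R=1.5))
```

The tolerance `tol` is there because a local NLP solver typically returns points that satisfy the constraints only up to a feasibility tolerance.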

Monday, August 12, 2024

Revised Simplex LP Solver written in GAMS

I am teaching some GAMS classes, and a question arose: "How does the Simplex method work?" It's not easy to answer in a few sentences, but I want to touch upon the concept of a basis anyway. Once you have a good intuition of what a basis is, a simple Simplex method is not so far-fetched. I find the tableau presentation somewhat confusing and far removed from what actual Simplex solvers do. I strongly prefer the Revised Simplex Method in matrix notation. 

Minor rant: I just don't understand the appeal of the tableau method. It looks to me like an invention for torturing undergrad students. Most of all, it is not very structure-revealing; it does not help you understand the underlying concepts. But about 100% of the LP textbooks insist we should learn that first.

As a gimmick, I implemented a simplified version in the GAMS language. This reminds me that someone spent the effort writing a Basic interpreter in TeX [1]. This is probably just as useful.
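To make the notion of a basis concrete, here is a small Python sketch (not the GAMS implementation mentioned above). For a problem \(\min c^Tx\), \(Ax=b\), \(x\ge 0\) and a given set of basic columns, it solves \(Bx_B=b\) and \(B^Ty=c_B\) and computes the reduced costs of the nonbasic columns. The example data and the function names are assumptions, and exact fractions are used to avoid tolerance issues.

```python
from fractions import Fraction

def solve(B, rhs):
    """Solve B x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(B)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(B, rhs)]
    for col in range(n):
        # pick a row with a nonzero pivot (raises StopIteration if B is singular)
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]

def basic_solution(A, b, c, basis):
    """Return x_B, duals y, and reduced costs of the nonbasic columns."""
    m, n = len(A), len(A[0])
    B = [[A[i][j] for j in basis] for i in range(m)]
    x_B = solve(B, b)                                  # B x_B = b
    Bt = [[B[i][k] for i in range(m)] for k in range(m)]
    y = solve(Bt, [c[j] for j in basis])               # B^T y = c_B
    reduced = {j: c[j] - sum(y[i] * A[i][j] for i in range(m))
               for j in range(n) if j not in basis}
    return x_B, y, reduced

# Hypothetical LP in standard form (columns 2 and 3 are slacks):
#   min -x0 - x1  s.t.  x0 + 2 x1 + s0 = 4,  3 x0 + x1 + s1 = 6,  x >= 0
A = [[1, 2, 1, 0], [3, 1, 0, 1]]
b = [4, 6]
c = [-1, -1, 0, 0]

x_slack, _, d_slack = basic_solution(A, b, c, basis=[2, 3])
x_opt, y_opt, d_opt = basic_solution(A, b, c, basis=[0, 1])
print(x_slack, d_slack)
print(x_opt, d_opt)
```

For a minimization problem, a feasible basis (\(x_B \ge 0\)) with nonnegative reduced costs is optimal. The slack basis has negative reduced costs, so a Simplex iteration would bring one of those columns into the basis; the basis `[0, 1]` passes the test.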

Monday, July 15, 2024

Sunday, June 30, 2024

Inflation is a difficult concept for many

Last Friday, 6/28, new PCE (Personal Consumption Expenditures Price Index) data were released. The year-on-year inflation numbers decreased from 2.7% last month to 2.6% [1]: 



Let's see how the popular press reports this [2]:

Wednesday, May 15, 2024

Another very small but very difficult global NLP model

The goal of this exercise is to fill a rectangular area \([0,250]\times[0,100]\) with 25 circles. The model can choose the \(x\) and \(y\) coordinates of the center of each circle and the radius. So we have as variables \(\color{darkred}x_i\), \(\color{darkred}y_i\), and \(\color{darkred}r_i\). The circles placed inside the area should not overlap. The objective is to maximize the total area covered. 

A solution is:


Thursday, May 9, 2024

Modeling surprises

Here is an example where the PuLP modeling tool goes berserk.

In standard linear programming, only \(\ge\), \(=\) and \(\le\) constraints are supported. Some tools also allow \(\ne\), which for MIP models needs to be reformulated into a disjunctive constraint. Here is an attempt to do this in PuLP [1]. PuLP does not support this relational operator in its constraints, so we would expect a meaningful error message.
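The following toy Python classes illustrate how operator overloading can silently accept an unsupported operator. This is emphatically not PuLP's implementation; `Var` and `Constraint` are made-up classes that only overload `<=`, `>=`, and `==`. Python's default `!=` then falls back to negating the result of `==`, so instead of an error we get a plain boolean.

```python
class Constraint:
    """Toy constraint object: just records lhs, sense, rhs."""
    def __init__(self, lhs, sense, rhs):
        self.lhs, self.sense, self.rhs = lhs, sense, rhs

    def __repr__(self):
        return f"Constraint({self.lhs} {self.sense} {self.rhs})"

class Var:
    """Toy variable that overloads only <=, >= and ==."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name

    def __le__(self, other):
        return Constraint(self, "<=", other)

    def __ge__(self, other):
        return Constraint(self, ">=", other)

    def __eq__(self, other):
        return Constraint(self, "==", other)

x = Var("x")
c1 = x <= 5    # a Constraint object, as intended
c2 = x != 5    # no __ne__ defined: Python evaluates `not (x == 5)`
print(c1)
print(c2)
```

Here `c2` is the boolean `False`, because the `Constraint` returned by `==` is truthy. A modeling layer that wants a meaningful error message has to define `__ne__` explicitly and raise an exception there.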

Monday, May 6, 2024

Rounding inside an optimization model

In [1], the question was asked: how can I round to two decimal places inside an optimization model? I.e., \[\color{darkred}y_{i,j} = \mathbf{round}(\color{darkred}x_{i,j},2)\] To get this off my chest first: I have never encountered a situation like this. Rounding to two decimal places is more for reporting than something we want inside model equations. Given that, let me look into this modeling problem a bit more as an exercise. 
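As a starting point, one possible way to express the rounding with an extra integer variable is sketched below. The integer variable \(\color{darkred}k_{i,j}\) is introduced here for illustration; the sketch assumes \(\color{darkred}x_{i,j}\) is bounded, so \(\color{darkred}k_{i,j}\) can be given finite bounds, and it leaves the tie case (\(\color{darkred}x_{i,j}\) exactly halfway between two candidates) undetermined.

Rounding sketch
\[\begin{align} & \color{darkred}y_{i,j} = 0.01\cdot \color{darkred}k_{i,j} \\ & \color{darkred}x_{i,j} - 0.005 \le \color{darkred}y_{i,j} \le \color{darkred}x_{i,j} + 0.005 \\ & \color{darkred}k_{i,j} \in \mathbb{Z} \end{align}\]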

Monday, April 15, 2024

LP in statistics: The Dantzig Selector

Lots of statistical procedures are based on an underlying optimization problem. Least squares regression and maximum likelihood estimation are two obvious examples. In a few cases, linear programming is used. Some examples are:

  • Least absolute deviation (LAD) regression [1]
  • Chebyshev regression [2]
  • Quantile regression [3]
Here is another regression example that uses linear programming. 

We want to estimate a sparse vector \(\color{darkred}\beta\) from the linear model \[\color{darkblue}y=\color{darkblue}X\color{darkred}\beta+\color{darkred}e\] where the number of observations \(n\) (rows in \(\color{darkblue}X\)) is (much) smaller than the number of coefficients \(p\) to estimate (columns in \(\color{darkblue}X\)) [4]: \(p \gg n\). This is an alternative to the well-known Lasso method [5].
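For reference, the Dantzig Selector is usually stated as the following optimization problem, where \(\color{darkblue}\delta \ge 0\) is a tuning parameter (the symbol is chosen here and may differ from the notation in the references):

Dantzig Selector
\[\begin{align}\min\>&\|\color{darkred}\beta\|_1\\ & \|\color{darkblue}X^T(\color{darkblue}y-\color{darkblue}X\color{darkred}\beta)\|_\infty \le \color{darkblue}\delta\end{align}\]

Both norms can be linearized in the standard way, which gives an LP. In the sketch below, the auxiliary variables \(\color{darkred}u_j\) are an assumption of this illustration:

LP sketch
\[\begin{align}\min\>&\sum_j \color{darkred}u_j\\ & -\color{darkred}u_j \le \color{darkred}\beta_j \le \color{darkred}u_j && \forall j\\ & -\color{darkblue}\delta \le \left(\color{darkblue}X^T(\color{darkblue}y-\color{darkblue}X\color{darkred}\beta)\right)_j \le \color{darkblue}\delta && \forall j\end{align}\]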

Friday, April 12, 2024

Instead of integers use binaries

In [1], a small (fragment of a) model is proposed:

High-Level Model
\[\begin{align} \min\> & \sum_i | \color{darkblue}a_i\cdot \color{darkred}x_i| \\ & \max_i |\color{darkred}x_i| = 1 \\ & \color{darkred}x_i \in \{-1,0,1\} \end{align}\]

Can we formulate this as a straight MIP? 
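One possible reformulation with binary variables, given here only as a sketch, splits each \(\color{darkred}x_i\) into a positive and a negative part. The variables \(\color{darkred}x_i^+\) and \(\color{darkred}x_i^-\) are introduced for illustration. It uses that \(|\color{darkblue}a_i\cdot \color{darkred}x_i| = |\color{darkblue}a_i|\cdot|\color{darkred}x_i|\), and that \(\max_i |\color{darkred}x_i| = 1\) simply says at least one \(\color{darkred}x_i\) is nonzero.

Binary sketch
\[\begin{align} \min\> & \sum_i |\color{darkblue}a_i|\cdot \left(\color{darkred}x_i^+ + \color{darkred}x_i^-\right) \\ & \color{darkred}x_i = \color{darkred}x_i^+ - \color{darkred}x_i^- \\ & \color{darkred}x_i^+ + \color{darkred}x_i^- \le 1 \\ & \sum_i \left(\color{darkred}x_i^+ + \color{darkred}x_i^-\right) \ge 1 \\ & \color{darkred}x_i^+, \color{darkred}x_i^- \in \{0,1\} \end{align}\]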

Thursday, March 28, 2024

Water



  • Fascinating map with annual water throughput. 
  • This is related to water availability for irrigation. An important topic.
  • The Rio Grande is not so grand here.
  • It must not be completely trivial to produce this map.
  • See: 
    Peter Gleick and Matthew Heberger, American Rivers: A Graphic, https://pacinst.org/american-rivers-a-graphic/


Saturday, February 10, 2024

Math vs Programming

A programmer writes about this blog:



(It is old, but I just came across this).

In my previous post, I just argued the other way around. To make sure: I don't hate programmers.

BTW, in quite a few programming languages for loops are very slow, and need to be replaced by something like sum(). Examples: Python, R, SQL. 
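A quick way to see this in Python is to time an explicit loop against the built-in `sum()`. The function name `loop_total` and the data size are arbitrary choices for this sketch, and the timings will differ from machine to machine.

```python
import timeit

def loop_total(values):
    """Add up the values with an explicit for loop."""
    total = 0
    for v in values:
        total += v
    return total

values = list(range(100_000))

# Both approaches give the same answer; only the running time differs.
t_loop = timeit.timeit(lambda: loop_total(values), number=5)
t_sum = timeit.timeit(lambda: sum(values), number=5)
print(f"loop: {t_loop:.4f}s   sum(): {t_sum:.4f}s")
```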

Thursday, February 8, 2024

Small non-convex MINLP: Pyomo vs GAMS

In [1], the following Pyomo model (Python fragment) is presented:


model.x = Var(name="Number of batches", domain=NonNegativeIntegers, initialize=10)                    
model.a = Var(name="Batch Size", domain=NonNegativeIntegers, bounds=(5,20))

# Objective function
def total_production(model):
    return model.x * model.a
model.total_production = Objective(rule=total_production, sense=minimize)

# Constraints
# Minimum production of the two output products
def first_material_constraint_rule(model):
    return sum(0.2 * model.a * i for i in range(1, value(model.x)+1)) >= 70
model.first_material_constraint = Constraint(rule=first_material_constraint_rule)

def second_material_constraint_rule(model):
    return sum(0.8 * model.a * i for i in range(1, value(model.x)+1)) >= 90
model.second_material_constraint = Constraint(rule=second_material_constraint_rule)

# At least one production run
def min_production_rule(model):
    return model.x >= 1
model.min_production = Constraint(rule=min_production_rule)

Tuesday, January 30, 2024

One nonzero in set of free variables

In [1] the following question is posed:

I have free variables \(\color{darkred}x_i\). How can I impose the constraint that at least one of the variables is nonzero: \(\color{darkred}x_i\ne 0\).
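A common way to approach this, shown here only as a sketch, needs two extra assumptions that are not part of the question: finite bounds \(-\color{darkblue}U \le \color{darkred}x_i \le \color{darkblue}U\), and a small tolerance \(\color{darkblue}\varepsilon \gt 0\) that defines what "nonzero" means numerically. The binary variables \(\color{darkred}\delta_i^+\) and \(\color{darkred}\delta_i^-\) are introduced for illustration.

Big-M sketch
\[\begin{align} & \color{darkred}x_i \ge \color{darkblue}\varepsilon\cdot\color{darkred}\delta_i^+ - \color{darkblue}U\cdot(1-\color{darkred}\delta_i^+) && \forall i \\ & \color{darkred}x_i \le -\color{darkblue}\varepsilon\cdot\color{darkred}\delta_i^- + \color{darkblue}U\cdot(1-\color{darkred}\delta_i^-) && \forall i \\ & \sum_i \left(\color{darkred}\delta_i^+ + \color{darkred}\delta_i^-\right) \ge 1 \\ & \color{darkred}\delta_i^+, \color{darkred}\delta_i^- \in \{0,1\} \end{align}\]

If \(\color{darkred}\delta_i^+=1\), then \(\color{darkred}x_i \ge \color{darkblue}\varepsilon\); if \(\color{darkred}\delta_i^-=1\), then \(\color{darkred}x_i \le -\color{darkblue}\varepsilon\). When a binary variable is zero, the corresponding constraint reduces to the bound on \(\color{darkred}x_i\).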

Tuesday, January 16, 2024

Informs Test of Time Award for CONOPT paper

The Test of Time Award for papers published in the INFORMS Journal on Computing in the years 1993–1997 is awarded to

CONOPT: A Large-Scale GRG Code

Arne Stolbjerg Drud

ORSA Journal on Computing 6(2):207–216, 1994 


As Arne notes in [1], he is helped a bit by the fact that CONOPT users may want to cite a published paper (and because there is no newer successor paper). Still, this is quite an achievement. 

Monday, January 8, 2024

GAMS listing file: missing Unicode support

Newer versions of GAMS allow UTF-8 encoded strings as labels. That is very welcome, as these labels may come from data sources that just use Unicode characters. However, when printing to the listing file, we miss proper Unicode support. At first, I thought, "OK, just a few misaligned tables. No big deal." Here is a constructed example showing this may be a bit more problematic.
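The kind of misalignment meant here can be reproduced in a few lines of Python. The sketch assumes the columns are padded based on the number of UTF-8 bytes rather than the number of characters. That is a guess made for illustration, not a statement about how GAMS is implemented. The labels and function names are made up.

```python
def pad_by_bytes(label, width):
    """Pad as if every byte occupied one column (assumed faulty behavior)."""
    return label + " " * max(0, width - len(label.encode("utf-8")))

def pad_by_chars(label, width):
    """Pad based on the number of characters."""
    return label + " " * max(0, width - len(label))

labels = ["Zurich", "Z\u00fcrich", "S\u00e3o Paulo"]   # Zurich, Zürich, São Paulo
for lab in labels:
    print(pad_by_bytes(lab, 12) + "|" + "   " + pad_by_chars(lab, 12) + "|")
```

In the first column, the closing bar shifts to the left for every label that contains a multi-byte character. Counting characters fixes this example, although it still ignores wide and combining characters.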

Thursday, January 4, 2024

String Art




 

In [1], a greyscale picture is approximated by strings (lines) between points around the image. Here, I will try something similar with a formal optimization model.