Monday, April 15, 2024

LP in statistics: The Dantzig Selector

Lots of statistical procedures are based on an underlying optimization problem. Least squares regression and maximum likelihood estimation are two obvious examples. In a few cases, linear programming is used. Some examples are:

  • Least absolute deviation (LAD) regression [1]
  • Chebyshev regression [2]
  • Quantile regression [3]
Here is another regression example that uses linear programming. 

We want to estimate a sparse vector \(\color{darkred}\beta\) from the linear model \[\color{darblue}y=\color{darkblue}X\color{darkred}\beta+\color{darkred}e\] where the number of observations \(n\) (rows in \(\color{darkblue}X\)) is (much) smaller than the number of coefficients \(p\) to estimate (columns in \(\color{darkblue}X\)) [4]: \(p \gg n\). This is an alternative to the well-known Lasso method [5].

Friday, April 12, 2024

Instead of integers use binaries

In [1], a small model is proposed:

High-Level Model
\[\begin{align} \min\> & \sum_i | \color{darkblue}a_i\cdot \color{darkred}x_i| \\ & \max_i |\color{darkred}x_i| = 1 \\ & \color{darkred}x_i \in \{-1,0,1\} \end{align}\]

Can we formulate this as a straight MIP? 

Thursday, March 28, 2024


  • Fascinating map with annual water throughput. 
  • This is related to water availability for irrigation. An important topic.
  • The Rio Grande is not so grand here.
  • It must not be completely trivial to produce this map.
  • See: 
    Peter Gleick and Matthew Heberger, American Rivers: A Graphic,

Saturday, February 10, 2024

Math vs Programming

 A programmer writes about this blog:

(It is old, but I just came across this).

In my previous post, I just argued the other way around. To make sure: I don't hate programmers.

BTW, in quite a few programming languages for loops are very slow, and need to be replaced by something like sum(). Examples: Python, R, SQL. 

Thursday, February 8, 2024

Small non-convex MINLP: Pyomo vs GAMS

 In [1], the following Pyomo model (Python fragment) is presented:

model.x = Var(name="Number of batches", domain=NonNegativeIntegers, initialize=10)                    
model.a = Var(name="Batch Size", domain=NonNegativeIntegers, bounds=(5,20))

# Objective function
def total_production(model):
    return model.x * model.a
model.total_production = Objective(rule=total_production, sense=minimize)

# Constraints
# Minimum production of the two output products
def first_material_constraint_rule(model):
    return sum(0.2 * model.a * i for i in range(1, value(model.x)+1)) >= 70
model.first_material_constraint = Constraint(rule=first_material_constraint_rule)

def second_material_constraint_rule(model):
    return sum(0.8 * model.a * i for i in range(1, value(model.x)+1)) >= 90
model.second_material_constraint = Constraint(rule=second_material_constraint_rule)

# At least one production run
def min_production_rule(model):
    return model.x >= 1
model.min_production = Constraint(rule=min_production_rule)

Tuesday, January 30, 2024

One nonzero in set of free variables

In [1] the following question is posed:

I have free variables \(\color{darkred}x_i\). How can I impose the constraint that at least one of the variables is nonzero: \(\color{darkred}x_i\ne 0\).

Tuesday, January 16, 2024

Informs Test of Time Award for CONOPT paper

The Test of Time Award for papers published in the INFORMS Journal on Computing in the years 1993–1997 is awarded to

CONOPT: A Large-Scale GRG Code

Arne Stolbjerg Drud

ORSA Journal on Computing 6(2):207–216, 1994 

As Arne notes in [1], he is helped a bit by the fact that CONOPT users may want to cite a published paper (and because there is no newer successor paper). Still, this is quite an achievement. 

Monday, January 8, 2024

GAMS listing file: missing Unicode support

Newer versions of GAMS allow UTF-8 encoded strings as labels. That is very welcome, as these labels may come from data sources that just use Unicode characters. However, when printing to the listing file, we miss proper Unicode support. At first, I thought, "OK, just a few misaligned tables. No big deal." Here is a constructed example showing this may be a bit more problematic.

Thursday, January 4, 2024

String Art


In [1], a greyscale picture is approximated by strings (lines) between points around the image. Here, I will try something similar with a formal optimization model.

Sunday, November 19, 2023

Grouping items: a difficult combinatorial problem

In [1], a simple problem is described:

  • We have \(n\) items (or orders) with a certain width. 
  • We need to combine these items in groups (called patterns) with rather tight limits on the total width. The total length of a pattern (the sum of the lengths of the items assigned to this pattern) must be between 335 and 340.
  • As a result, we may not be able to assign all items. The remaining items cannot be formed into valid patterns.
  • The objective is to try to place as many items as possible into patterns.
  • An indication of the size of the problem: \(n \approx 500\).  


Instead of immediately working on a full-known \(n=500\) problem, I generated a random data set with a very manageable \(n=50\) items. The widths were drawn from a discrete uniform distribution between 30 and 300. The data looks like:

----     15 PARAMETER w  item widths

order1   76.000,    order2  258.000,    order3  179.000,    order4  111.000,    order5  109.000,    order6   90.000
order7  124.000,    order8  262.000,    order9   48.000,    order10 165.000,    order11 300.000,    order12 186.000
order13 298.000,    order14 236.000,    order15  65.000,    order16 203.000,    order17  73.000,    order18  97.000
order19 211.000,    order20 147.000,    order21 127.000,    order22 125.000,    order23  65.000,    order24  70.000
order25 189.000,    order26 255.000,    order27  92.000,    order28 210.000,    order29 240.000,    order30 112.000
order31  59.000,    order32 166.000,    order33  73.000,    order34 266.000,    order35 101.000,    order36 107.000
order37 190.000,    order38 225.000,    order39 200.000,    order40 155.000,    order41 142.000,    order42  61.000
order43 115.000,    order44  42.000,    order45 121.000,    order46  79.000,    order47 204.000,    order48 181.000
order49 238.000,    order50 110.000

I stick to the pattern limits \(\color{darkblue}L=335\) and \(\color{darkblue}U=340\).

We need some estimate of the number of patterns to use. We could just guess. But a better approach is the following. An upper bound for the number patterns can be established quite easily: \[{\mathit{maxj}} = \left\lfloor \frac{\sum_i \color{darkblue}w_i}{\color{darkblue}L}\right\rfloor\] For our data set this number is:

----     29 PARAMETER maxj                 =       22.000  max number of patterns we can fill

This means we can safely use this number as the number of bins (patterns).