Wednesday, August 10, 2022

Large sparse transportation model with CVXPY,CVXR

In [1] I was trying out different formulations of a large, sparse (but very easy) transportation model using different modeling tools and solvers. The conclusion was:

  • Exploiting sparsity leads to the best performance
  • GAMS/Cplex is about 10 times as fast as Python/PuLP/CBC
  • GAMS/Cplex is about 1000 times as fast as R/OMPR/GLPK

Don't overinterpret these numbers: this is a single model, which may not be at all representable for your models. Let's see how CVXPY[2] and CVXR[3] are doing. To recap we want to solve a large (but easy) transportation model:

Dense Transportation Model
\[\begin{align}\min&\sum_{i,j}\color{darkblue}c_{i,j}\cdot\color{darkred}x_{i,j}\\ & \sum_j \color{darkred}x_{i,j} \le \color{darkblue}a_i && \forall i \\& \sum_i \color{darkred}x_{i,j} \ge \color{darkblue}b_j && \forall j \\ & \color{darkred}x_{i,j} \ge 0 \end{align}\]

The additional wrinkle is that only a small fraction of the links \(i \rightarrow j\) exist.

There are (at least) three ways to model this:

  1. Use a large cost coefficient for forbidden links.
  2. Fix \(\color{darkred}x_{i,j}=0\) for non-existing links.
  3. Exploit sparsity directly by only summing over existing links.
CVXPY/CVXR are matrix-based modeling tools. That gives some extra things to think about. A transportation model in matrix-notation can look like:

Transportation Model in Matrix Notation
\[\begin{align}\min\>&{\bf tr}(\color{darkblue}C^T\color{darkred}X)\\&\color{darkred}X\color{darkblue}e \le \color{darkblue}a \\ & \color{darkred}X^t\color{darkblue}e \ge \color{darkblue}b \\ & \color{darkred}X \ge 0 \end{align}\]

Monday, August 1, 2022


Newer versions of Windows come with a very useful tool to download files: curl.exe.

The standard help is:

C:\Users\erwin>where curl



C:\Users\erwin>curl --help

Usage: curl [options...] <url>

 -d, --data <data>          HTTP POST data

 -f, --fail                 Fail fast with no output on HTTP errors

 -h, --help <category>      Get help for commands

 -i, --include              Include protocol response headers in the output

 -o, --output <file>        Write to file instead of stdout

 -O, --remote-name          Write output to a file named as the remote file

 -s, --silent               Silent mode

 -T, --upload-file <file>   Transfer local FILE to destination

 -u, --user <user:password> Server user and password

 -A, --user-agent <name>    Send User-Agent <name> to server

 -v, --verbose              Make the operation more talkative

 -V, --version              Show version number and quit


This is not the full help, this menu is stripped into categories.

Use "--help category" to get an overview of all categories.

For all options use the manual or "--help all".



A model with semi-continuous variables

In [1], a selection problem is posted, that turns out to be an easy MIP. The problem is:

From a (large) collection of products select those products with the highest unit return (or value) subject to:
  • A product can only be ordered between a minimum and maximum amount
  • There is a budget w.r.t. the total amount of products
This can be modeled using so-called semi-continuous variables: variables that can assume zero or a value between the constants \(\color{darkblue}L\) and \(\color{darkblue}U\). So we can write:

MIP model with semi-continuous variables
\[\begin{align} \max & \sum_i \color{darkred}x_i \cdot \color{darkblue}{\mathit{rate}}_i \\ & \sum_i \color{darkred}x_i \le \color{darkblue}{\mathit{budget}} \\ & \color{darkred}x_i \in \{0\}\cup [\color{darkblue}L_i,\color{darkblue}U_i] \end{align} \]