Wednesday, May 10, 2023

Generate all solutions that sum up to one

In a post, the following question was posed:

We can select unique values  \(\displaystyle\frac{1}{i}\) for \(i=1,\dots,n\). Find all combinations that add up to 1.

A complete enumeration scheme was slow even for \(n=10\). Can we use a MIP model for this or something related?

A single solution is easily found using the model:


Mathematical Model
\[ \begin{align} & \sum_{i=1}^n \frac{1}{i} \cdot \color{darkred}x_i = 1 \\ & \color{darkred}x_i \in \{0,1\} \end{align}\]

Sunday, May 7, 2023

Finding common patterns

In [1], the following problem is stated:

Given a boolean matrix, with \(m\) rows and \(n\) columns, find the largest pattern of ones that is found in at least \(\color{darkblue}K\) rows. We can ignore cells where the pattern has a zero value: they don't count.

A small example [1] is given: 

row 1:[0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1]
row 2:[0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
row 3:[0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
row 4:[1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
row 5:[1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
row 6:[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]


 With \(K=3\), we can form a pattern with 10 nonzero elements:

 
row 1:  [0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1]
row 2:  [0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
row 3:  [0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
row 4:  [1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
row 5:  [1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
row 6:  [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1] 
pattern:[1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

The pattern is shared by rows 4, 5, and 6.

Can we formulate a MIP model for this? My first attempt is as follows.

Tuesday, May 2, 2023

Solving as network with lowerbounds

In [1], we looked at the following problem:


Mathematical Model
\[ \begin{align} \min& \sum_{i,j} \color{darkblue}a_{i,j} \cdot \color{darkred}x_{i,j} \\ & \sum_j \color{darkblue}a_{i,j}\cdot \color{darkred}x_{i,j} \ge \color{darkblue}r_i && \forall i \\ & \sum_i \color{darkblue}a_{i,j}\cdot \color{darkred}x_{i,j} \ge \color{darkblue}c_j && \forall j \\ & \color{darkred}x_{i,j} \in \{0,1\} \end{align}\]

Wednesday, April 26, 2023

A large MIP model that should be solved as LP: the root node

In this post, I want to discuss an observation about the root node when solving a MIP model.

Problem Description


The model I want to tackle here is from [1]:

We have a large matrix \(\color{darkblue}A=\color{darkblue}a_{i,j}\) with values 0 and 1. In addition, we have a minimum on the row and column totals. These are called \(\color{darkblue}r_i\) and \(\color{darkblue}c_j\). The goal is to remove as many 1's in the matrix \(\color{darkblue}A\) subject to these minimum row and column totals.

Tuesday, April 18, 2023

Some GAMS embedded Python notes

Here are two issues you may want to be aware of. The discussion below is relevant for Windows and not so much for Unix variants.


Using Python Raw strings for directories


Here we use some Python inside an otherwise empty GAMS model:

$onEmbeddedCode Python:

dir = r"%system.fp%"
print(dir)

$offEmbeddedCode 


Wednesday, April 5, 2023

In-process, in-memory databases

There are a few database systems that are a bit different. They are libraries that can be linked directly to your application. Linking can be done statically (during the compilation/linking step) or dynamically (using a shared library or DLL). Here I want to show two cases:

  • SQLite [1] used from R on data frames
  • DuckDB [2] used from Python, again on data frames
So these databases don't only run inside R or Python but also can operate directly on data frames.

Tuesday, April 4, 2023

Compression and large tables (Excel and CSV)




Here is an interesting little experiment: load a large CSV file into Excel. My original Powerpoint slide from a recent presentation was not as complete as it should be. Here is more size info:

Saturday, March 25, 2023

Simultaneous equation models and data errors

My experience is that using (optimization) models is a great way to provide quality assurance on the data. Data collection can be very difficult and expensive. It can also be quite easy to have errors cropping up somewhere along the way. When using an optimization model, we will get hammered by such errors.

Thursday, March 16, 2023

Algorithm vs. model

From [1]:


We are given a plane defined by Ax+By+Cz-D = 0 where D is significantly larger than A,B,C and GCD(A,B,C) = 1. How would I find all points (x, y, z), where x,y,z are integers and >= 0, that lie on the plane in an efficient manner?


So the poster asks for an algorithm to find \(\color{darkred}x,\color{darkred}y,\color{darkred}z \in \{0,1,2,\dots\}\) such that \[\color{darkblue}A \cdot \color{darkred}x + \color{darkblue}B \cdot \color{darkred}y + \color{darkblue}C \cdot \color{darkred}z = \color{darkblue}D\] Besides the assumptions stated in the question, I'll further assume \(\color{darkblue}A,\color{darkblue}B,\color{darkblue}C,\color{darkblue}D\gt 0\).

Tuesday, March 14, 2023

Choosing between NLP solvers: interior point or active set.

One way to categorize (local) nonlinear programming (NLP) solvers is active set methods and interior point solvers. Some representative large-scale sparse solvers are:

  • Active set: CONOPT, SNOPT. These are using SQP algorithms.
  • Interior point: IPOPT, Knitro. Note: Knitro also contains an active set algorithm.
These are two very different types of algorithms. They work very well on different types of problems. If you do a lot of nonlinear optimization, it actually makes sense to have both types of solvers available. With modeling tools like AMPL and GAMS, it is very easy to switch between solvers. Testing performance between two solvers is then also a snap.

Here I want to discuss some factors that can help in deciding whether to use an active set solver or an interior point solver.