Monday, June 22, 2026

Mixture models as a math programming problem

We can formulate a linear least-squares regression model as an optimization problem. Statistical packages do not solve these problems this way; they often use a QR decomposition instead. An explicit optimization formulation is useful when we need to add unusual constraints that the statistical package does not support directly, or when we need to optimize a special maximum-likelihood function.
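To make this concrete, here is a minimal sketch in plain Python. It is not how a statistical package works, and it is not the model developed in this post. It minimizes the sum of squared residuals of a line y = a + b*x by coordinate descent and can optionally cap the slope, the kind of side constraint a standard regression routine may not offer. The function names, the tiny data set, and the slope cap are all made up for illustration.

```python
def sse(a, b, xs, ys):
    """Objective: sum of squared residuals of the line y = a + b*x."""
    return sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys, slope_max=None, sweeps=500):
    """Minimize sse(a, b) by coordinate descent, optionally with b <= slope_max.

    Each sweep minimizes the objective exactly over a (b fixed) and then
    over b (a fixed); the bound on b is enforced by clipping.
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    a = b = 0.0
    for _ in range(sweeps):
        a = sum(y - b * x for x, y in zip(xs, ys)) / n
        b = sum(x * (y - a) for x, y in zip(xs, ys)) / sxx
        if slope_max is not None:
            b = min(b, slope_max)
    return a, b


# Illustrative data (hypothetical, not the data set of this post).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a_free, b_free = fit_line(xs, ys)               # ordinary least squares
a_cap, b_cap = fit_line(xs, ys, slope_max=1.5)  # same objective, slope capped
print(a_free, b_free, sse(a_free, b_free, xs, ys))
print(a_cap, b_cap, sse(a_cap, b_cap, xs, ys))
```

The capped fit has a larger sum of squares than the unconstrained one, as expected: adding a constraint can only make the optimal objective worse or leave it unchanged.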

Here is another example of a least-squares regression problem where we can benefit from mathematical programming techniques.

Data


Statistical problems typically start with a data set:

----     79 PARAMETER data  

                 x           y

case1       20.202      85.162
case2        0.507       2.103
case3       26.961      55.969
case4       49.985      44.690
case5       15.129      86.515
case6       17.417      79.866
case7       33.064      56.328
case8       31.691      29.422
case9       32.209      64.021
case10      96.398      85.191
case11      99.360      68.235
case12      36.990      57.516
case13      37.289      25.884
case14      77.198      56.157
case15      39.668      58.398
case16      91.310      66.205
case17      11.958      93.742
case18      73.548      28.178
case19       5.542       5.788
case20      57.630      60.830
case21       5.141      53.988
case22       0.601      42.559
case23      40.123      61.928
case24      51.988      42.984
case25      62.888      58.308
case26      22.575       4.414
case27      39.612      67.282
case28      27.601      56.445
case29      15.237       0.218
case30      93.632      11.896
case31      42.266      60.515
case32      13.466      51.721
case33      38.606      65.392
case34      37.463      16.978
case35      26.848      74.588
case36      94.837      -0.803
case37      18.894      60.060
case38      29.751      14.005
case39       7.455      60.066
case40      40.135      62.898





It looks like there is not a single regression line, but rather three of them hidden in this data set. So we should find something like:

[Figure: the data with three fitted regression lines]

The question is: how can we estimate the intercept and slope of these three lines?
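One way to state this estimation problem as a mathematical program is sketched below. This is an assumed formulation for illustration, not necessarily the model developed in this post. The symbols are introduced here: \(\alpha_k\) and \(\beta_k\) are the intercept and slope of line \(k\), \(r_i\) is the residual of case \(i\), and \(z_{i,k}\) is 1 if case \(i\) is assigned to line \(k\).

\[\begin{aligned} \min\> & \sum_i r_i^2 \\ & z_{i,k}=1 \implies r_i = y_i - \alpha_k - \beta_k x_i && \forall i,k \\ & \sum_k z_{i,k} = 1 && \forall i \\ & z_{i,k} \in \{0,1\} \end{aligned}\]

Each case is assigned to exactly one of the three lines, and only the residual with respect to that line enters the objective. With a convex quadratic objective and binary assignment variables, this is a mixed-integer quadratic program; the implication can be implemented with indicator constraints or with big-M constraints.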

Tuesday, June 2, 2026

MINLP instead of indicator constraints?

In this post, I want to discuss indicator constraints and how to replace them with simple multiplications. As we shall see, this is a somewhat harebrained but still interesting idea. 

Indicator constraints are implications of the form: \[\delta=0 \implies \text{linear constraint}\] or \[\delta=1 \implies \text{linear constraint}\] where \(\delta \in \{0,1\}\) is a binary decision variable.
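As a sketch of what replacing an indicator constraint with a multiplication can look like (an assumed reading of the idea, with \(a\), \(x\), \(b\), and \(M\) introduced here for illustration), take the implication \(\delta=1 \implies a^Tx \le b\). It can be written in two ways:

\[\begin{aligned} & a^Tx \le b + M(1-\delta) && \text{(big-M, linear)} \\ & \delta \cdot (a^Tx - b) \le 0 && \text{(product, nonlinear)} \end{aligned}\]

For \(\delta=1\), both reduce to \(a^Tx \le b\). For \(\delta=0\), the product form reads \(0 \le 0\) and is always satisfied, while the big-M form is only non-binding if \(M\) is a valid upper bound on \(a^Tx-b\). The product of a binary variable and a linear expression is bilinear, which is why the model becomes an MINLP.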

There are two aspects of indicator constraints: 
  • Indicator constraints help in MIP models where we would otherwise use big-M constraints. They help address the numerical issues that result from big-M constraints: small (and sometimes not-so-small) solution values where you expect zero, and poor solver performance. With big-M constraints, we need to pay close attention to the size of the big-M constants. This is the algorithmic aspect.
  • Indicator constraints provide a convenient modeling construct. They form a useful abstraction that makes MIP modeling easier and more straightforward. This is the modeling aspect.
Another example of a solver feature with such a dual benefit (algorithmic and modeling) is SOS2 sets for piecewise linear functions.
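The numerical side of the big-M discussion above can be illustrated with a small sketch in plain Python. It checks the implication \(\delta=1 \implies x \le 5\) for \(x \in [0,100]\) in three forms: as a logical condition, as a big-M constraint, and as a product. The constraint, the bounds, the values of M, and the integrality tolerance of 1e-6 are all assumptions chosen for illustration; no solver is involved.

```python
RHS = 5.0       # right-hand side of the implied constraint x <= 5
X_MAX = 100.0   # assumed upper bound on x


def indicator_ok(delta, x):
    """delta = 1  =>  x <= RHS, as a logical condition."""
    return delta == 0 or x <= RHS


def big_m_ok(delta, x, big_m):
    """Big-M form: x <= RHS + M * (1 - delta)."""
    return x <= RHS + big_m * (1 - delta)


def product_ok(delta, x):
    """Product form: delta * (x - RHS) <= 0."""
    return delta * (x - RHS) <= 0


points = [(d, float(x)) for d in (0, 1) for x in range(0, 101)]

# The smallest valid M is X_MAX - RHS = 95: all three forms agree.
tight = all(
    indicator_ok(d, x) == big_m_ok(d, x, X_MAX - RHS) == product_ok(d, x)
    for d, x in points
)

# An M that is too small cuts off feasible points (delta = 0, large x).
cut_off = [(d, x) for d, x in points
           if indicator_ok(d, x) and not big_m_ok(d, x, 50.0)]

# A huge M combined with a delta that is 1 only up to a tolerance
# lets x = 100 through although the constraint should be active.
nearly_one = 1.0 - 1e-6
leak = big_m_ok(nearly_one, 100.0, 1e9)

print(tight, len(cut_off), leak)
```

Here `tight` is true, `cut_off` is non-empty, and `leak` is true. A big-M that is too small changes the model, and one that is too large lets a slightly fractional binary switch the constraint off. That is why the size of the big-M constants needs so much care.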