Thursday, February 25, 2021

A strange regression problem: multiple-line fitting

Consider the following data:


----     14 PARAMETER x  data

i1  17.175,    i2  84.327,    i3  55.038,    i4  30.114,    i5  29.221,    i6  22.405,    i7  34.983,    i8  85.627
i9   6.711,    i10 50.021,    i11 99.812,    i12 57.873,    i13 99.113,    i14 76.225,    i15 13.069,    i16 63.972
i17 15.952,    i18 25.008,    i19 66.893,    i20 43.536,    i21 35.970,    i22 35.144,    i23 13.149,    i24 15.010
i25 58.911,    i26 83.089,    i27 23.082,    i28 66.573,    i29 77.586,    i30 30.366,    i31 11.049,    i32 50.238
i33 16.017,    i34 87.246,    i35 26.511,    i36 28.581,    i37 59.396,    i38 72.272,    i39 62.825,    i40 46.380
i41 41.331,    i42 11.770,    i43 31.421,    i44  4.655,    i45 33.855,    i46 18.210,    i47 64.573,    i48 56.075
i49 76.996,    i50 29.781


----     14 PARAMETER y  data

i1    32.320,    i2   238.299,    i3    87.081,    i4    17.825,    i5   -10.154,    i6   -35.813,    i7    14.506
i8   204.903,    i9  -103.014,    i10   79.521,    i11 -153.679,    i12  -47.971,    i13  277.483,    i14  160.084
i15   28.687,    i16  133.650,    i17  -63.820,    i18    4.028,    i19  145.434,    i20   48.326,    i21   49.295
i22   35.478,    i23  -58.602,    i24   34.463,    i25  -50.666,    i26 -115.522,    i27  -19.540,    i28  134.829
i29  198.074,    i30   -7.351,    i31   40.786,    i32   82.107,    i33   27.661,    i34  -82.669,    i35    5.081
i36   -1.278,    i37  118.345,    i38  172.905,    i39  -57.943,    i40   56.981,    i41  -45.248,    i42   34.517
i43   11.780,    i44  -98.506,    i45   -1.064,    i46   33.323,    i47  139.050,    i48  -35.595,    i49 -102.580
i50    3.912


When we plot this, we see:



Saturday, February 20, 2021

2d bin packing with Google OR-Tools CP-SAT, how to fix?

In [1] I looked at some MIP formulations for a 2d bin packing problem. OR-Tools has a nice model.AddNoOverlap2D constraint function. So, I was excited to try this out. (In these times I get quickly enthusiastic about these things). 

The idea is as follows. ortools has this construct Constraint.OnlyEnforce If(boolean). This is essentially an indicator constraint: \((\color{darkred}x\ge 5)\text{.OnlyEnforceIf($\color{darkred}\delta$)}\) means \[\color{darkred}\delta=1 \Rightarrow \color{darkred}x \ge 5\] This inspires me to develop a model along the lines of: \[\begin{align} \min& \sum_k \color{darkred}u_k\\  & \color{darkred}\beta_{i,j} = 0 \Rightarrow \color{darkred}b_i\ne\color{darkred}b_j \\ & \color{darkred}\beta_{i,j} = 1 \Rightarrow \mathbf{NoOverlap2D}(i,j) \\ & \color{darkred}u_k = 0 \Rightarrow \color{darkred}b_i \lt k \end{align}\] Here \(\color{darkred}b_i\) is the bin number where item \(i\) is placed and \(\color{darkred}\beta_{i,j}\) is true if  \(\color{darkred}b_i=\color{darkred}b_j\).


OK, so I started coding:

Thursday, February 18, 2021

An easy variant of 2d bin packing

 In [1] the following problem is posted:

Consider the 2d bin packing problem [2]. Now we have the following additional restrictions: items of a different type or category cannot be below or above each other. In other words, we arrange items of the same type in columns. 

We use again our small data set: 


----     37 PARAMETER itemSize  sizes of bin and items

               w           h

cat1       7.000      12.000
cat2       9.000       3.000
cat3       5.000      14.000
cat4      13.000       9.000
cat5       6.000       8.000
cat6      20.000       5.000
bin       40.000      60.000


----     37 SET itemMap  mapping between items and categories

item1 .cat1,    item2 .cat1,    item3 .cat1,    item4 .cat1,    item5 .cat1,    item6 .cat1,    item7 .cat1
item8 .cat1,    item9 .cat1,    item10.cat1,    item11.cat2,    item12.cat2,    item13.cat2,    item14.cat2
item15.cat2,    item16.cat2,    item17.cat2,    item18.cat2,    item19.cat2,    item20.cat2,    item21.cat3
item22.cat3,    item23.cat3,    item24.cat3,    item25.cat3,    item26.cat3,    item27.cat3,    item28.cat3
item29.cat3,    item30.cat3,    item31.cat4,    item32.cat4,    item33.cat4,    item34.cat4,    item35.cat4
item36.cat4,    item37.cat4,    item38.cat4,    item39.cat4,    item40.cat4,    item41.cat5,    item42.cat5
item43.cat5,    item44.cat5,    item45.cat5,    item46.cat6,    item47.cat6,    item48.cat6,    item49.cat6
item50.cat6


I'll try two different models:

  • Assign items to bins, arrange them in columns
  • Assign columns to bins

Tuesday, February 16, 2021

2d Bin Packing

 In [1], a question was posed about a particular type of 2d bin-packing. This is a good reason to play with some models.

First I discuss three formulations with continuous no-overlap constraints and one for discrete data. In the next post, I'll discuss the variant described by the post [1]. This actually leads to a very different, but easy problem.


Problem statement and example data


We consider \(n=50\) rectangular items of different sizes that have to be placed into bins. There are six different categories, each with a different size (height and width). All the bins have the same size. The goal is to use as few bins as possible.


----     37 PARAMETER itemSize  sizes of bin and items

               w           h

cat1       7.000      12.000
cat2       9.000       3.000
cat3       5.000      14.000
cat4      13.000       9.000
cat5       6.000       8.000
cat6      20.000       5.000
bin       40.000      60.000


----     37 SET itemMap  mapping between items and categories

item1 .cat1,    item2 .cat1,    item3 .cat1,    item4 .cat1,    item5 .cat1,    item6 .cat1,    item7 .cat1
item8 .cat1,    item9 .cat1,    item10.cat1,    item11.cat2,    item12.cat2,    item13.cat2,    item14.cat2
item15.cat2,    item16.cat2,    item17.cat2,    item18.cat2,    item19.cat2,    item20.cat2,    item21.cat3
item22.cat3,    item23.cat3,    item24.cat3,    item25.cat3,    item26.cat3,    item27.cat3,    item28.cat3
item29.cat3,    item30.cat3,    item31.cat4,    item32.cat4,    item33.cat4,    item34.cat4,    item35.cat4
item36.cat4,    item37.cat4,    item38.cat4,    item39.cat4,    item40.cat4,    item41.cat5,    item42.cat5
item43.cat5,    item44.cat5,    item45.cat5,    item46.cat6,    item47.cat6,    item48.cat6,    item49.cat6
item50.cat6


Furthermore, we assume items cannot be rotated.

Friday, February 5, 2021

Non-differentiable NLP vs LP

 In [1] the following simple problem is posed:


Given data 

p = [50, 50.5, 52, 53, 55, 55.5, 56, 57,60, 61]

try to track this as close as possible by minimizing the largest deviation while obeying the restrictions:


\[\begin{align}&\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]


Non-differentiable NLP model


An obvious way to formulate this is: \[\begin{align} \min&\max_i |\color{darkred}x_i-\color{darkblue}p_i| \\ &\color{darkred}x_1 = \color{darkblue}p_1\\ &\color{darkred}x_i - \color{darkred}x_{i-1}\le 1.5\end{align}\]

This is a non-differentiable formulation, and we are asking for trouble when trying to solve this using a standard nonlinear NLP solver. For instance, when I solve this with CONOPT, we may see:


               S O L V E      S U M M A R Y

     MODEL   m1                  OBJECTIVE  z
     TYPE    DNLP                DIRECTION  MINIMIZE
     SOLVER  CONOPT              FROM LINE  25

**** SOLVER STATUS     4 Terminated By Solver      
**** MODEL STATUS      7 Feasible Solution         
**** OBJECTIVE VALUE                1.2500

 RESOURCE USAGE, LIMIT          0.062 10000000000.000
 ITERATION COUNT, LIMIT        31    2147483647
 EVALUATION ERRORS              0             0
CONOPT 3         33.2.0 r4f23b21 Released Dec 01, 2020 WEI x86 64bit/MS Window
 
 
    C O N O P T 3   version 3.17L
    Copyright (C)   ARKI Consulting and Development A/S
                    Bagsvaerdvej 246 A
                    DK-2880 Bagsvaerd, Denmark
 
 
    The model has 11 variables and 10 constraints
    with 29 Jacobian elements, 10 of which are nonlinear.
    The Hessian of the Lagrangian has 10 elements on the diagonal,
    45 elements below the diagonal, and 10 nonlinear variables.
 
                   Pre-triangular equations:   0
                   Post-triangular equations:  10
 
 
 ** Feasible solution. Convergence too slow. The change in objective
    has been less than 3.7500E-12 for 20 consecutive iterations


Conopt realizes it has some troubles and declares the solution as just feasible instead of optimal. Indeed, we can do better than an objective of 1.25. Other NLP solvers may not even know they are in trouble and just report a solution and declare it (locally) optimal. 

There is a big problem with this model. Most NLP solvers assume the objective and non-linear constraints are smooth (i.e. differentiable). In this case, we don't have this, and this lack of smoothness is often a recipe for disaster. This is a nice example of this. 

I often complain about non-differentiability in models I see on forums. Often the reaction is: "Who cares?". Indeed, often NLP solvers will deliver some solution without warning. This post is about a small problem, where we really can do much better.   

Note that the initial point also plays a role here. With a different starting point, you may find different solutions.

A global NLP solver may help for this model. Some of the ones I tried, don't have support for the max() function, so we need to reformulate the model. When doing that, we can actually do a bit better and form a linear programming problem.

Wednesday, February 3, 2021

Dijkstra confusion

 Dijkstra is a famous algorithm for the shortest path problem. The original paper is from 1959 [1]:


(The format of this paper is interesting: almost no mathematical notation).  Dijkstra's algorithm is not suited for all shortest path problems. Negative weights or lengths can cause problems. There are basically three different reasons I see mentioned:

  1. Dijkstra just does not work correctly if there are negative weights.
  2. Dijkstra does not work if there are negative cycles in the graph. This would imply that a problem with negative weights would work as long as there are no negative cycles. 
  3. Keep things vague and mumble a bit about negative weights and negative cycles.

I notice that option 3 seems very popular. As a result, looking at questions about this on stackexchange, I see there is much confusion about this. Also, I see vague, misleading messages in documentation and error/warning messages. For instance, let's have a look at [2]:

Also, this routine does not work for graphs with negative distances. Negative distances can lead to infinite cycles that must be handled by specialized algorithms such as Bellman-Ford’s algorithm or Johnson’s algorithm.

Well, this is indeed sufficiently vague. We deal with option 3: we don't know if this is referring to reason 1 or 2. The same function that this documentation is about can produce the following warning:

UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford. distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)

This seems to say we only need to worry about graphs with negative cycles. So that would be option 2. 


A small counter example


In [3] a small acyclic graph (a DAG: Directed Acyclic Graph) is presented:


When we try to find the shortest path from n0 to n2, using [2], we can see:

dijkstra
shortest path length: 1.0
johnson
shortest path length: -8.0
bellman_ford
shortest path length: -8.0
<ipython-input-25-7a1de3894d98>:13: UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford.
  distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)


This example disproves that we just have to watch out for negative cycles.  That means the documentation and warning messages should state clearly that the function scipy.sparse.csgraph.dijkstra does not work correctly with negative weights, irrespective of whether there are negative cycles.