Yet Another Math Programming Consultant: Shortest Path

Showing posts with label Shortest Path. Show all posts

Monday, February 24, 2025

Programming vs Modeling

I found another interesting GAMS model. Here, I wanted to demonstrate the difference between programming (implementing algorithms in a programming language) and modeling (building models and implementing them in a modeling tool). The problem is solving a shortest path problem given a sparse network with random costs. We compare three strategies:

Implement Dijkstra's shortest path algorithm in GAMS,
implement Dijkstra's shortest path algorithm in Python and
use a simple LP model.

One of the practical problems is how to represent and display the optimal shortest path in a meaningful and compact way. In some cases, this is more difficult than implementing the pure shortest path algorithm or model.

Dijkstra confusion

Dijkstra is a famous algorithm for the shortest path problem. The original paper is from 1959 [1]:

(The format of this paper is interesting: almost no mathematical notation). Dijkstra's algorithm is not suited for all shortest path problems. Negative weights or lengths can cause problems. There are basically three different reasons I see mentioned:

Dijkstra just does not work correctly if there are negative weights.
Dijkstra does not work if there are negative cycles in the graph. This would imply that a problem with negative weights would work as long as there are no negative cycles.
Keep things vague and mumble a bit about negative weights and negative cycles.

I notice that option 3 seems very popular. As a result, looking at questions about this on stackexchange, I see there is much confusion about this. Also, I see vague, misleading messages in documentation and error/warning messages. For instance, let's have a look at [2]:

Also, this routine does not work for graphs with negative distances. Negative distances can lead to infinite cycles that must be handled by specialized algorithms such as Bellman-Ford’s algorithm or Johnson’s algorithm.

Well, this is indeed sufficiently vague. We deal with option 3: we don't know if this is referring to reason 1 or 2. The same function that this documentation is about can produce the following warning:

UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford.
  distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)

This seems to say we only need to worry about graphs with negative cycles. So that would be option 2.

A small counter example

In [3] a small acyclic graph (a DAG: Directed Acyclic Graph) is presented:

When we try to find the shortest path from n0 to n2, using [2], we can see:

dijkstra
shortest path length: 1.0
johnson
shortest path length: -8.0
bellman_ford
shortest path length: -8.0

<ipython-input-25-7a1de3894d98>:13: UserWarning: Graph has negative weights: dijkstra will give inaccurate results if the graph contains negative cycles. Consider johnson or bellman_ford.
  distmat,pred = scipy.sparse.csgraph.dijkstra(spmat,indices=source,return_predecessors=True)

This example disproves that we just have to watch out for negative cycles. That means the documentation and warning messages should state clearly that the function scipy.sparse.csgraph.dijkstra does not work correctly with negative weights, irrespective of whether there are negative cycles.

Assignment problem with a wrinkle formulated as a network problem

In a previous post [1] I discussed a simple problem. But it turned out not so easy to solve for some of the larger data sets. Basically, it was an assignment problem with an extra condition. The problem was a follows:

Consider two arrays $\color{darkblue}a_i$ (length $\color{darkblue}m$) and $\color{darkblue}b_j$ (length $\color{darkblue}n$) with $\color{darkblue}m \lt \color{darkblue}n$. Assign all values $\color{darkblue}a_i$ to a $\color{darkblue}b_j$ such that:

Each $\color{darkblue}b_j$ can have 0 or 1 $\color{darkblue}a_i$ assigned to it.
The assignments need to maintain the original order of $\color{darkblue}a_i$. I.e. if $\color{darkblue}a_i \rightarrow \color{darkblue}b_j$ then $\color{darkblue}a_{i+1}$ must be assigned to a slot in $\color{darkblue}b$ that is beyond slot $j$. In the picture below that means that arrows cannot cross.
Do this while minimizing the sum of the products.

In [1], I attacked this as a mixed-integer programming problem. In this post, I want to see if we can solve this as a network problem. This was largely inspired by the comments in [1].

Shortest path algorithms and models

The shortest path problem can be solved in different ways. Dijkstra's algorithm is a popular choice due to its simplicity and speed. In [2] we can read a quote by Edsger Dijkstra:

What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given city. It is the algorithm for the shortest path, which I designed in about twenty minutes. One morning I was shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the shortest path. As I said, it was a twenty-minute invention. In fact, it was published in '59, three years later. The publication is still readable, it is, in fact, quite nice. One of the reasons that it is so nice was that I designed it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper is that you are almost forced to avoid all avoidable complexities. Eventually, that algorithm became to my great amazement, one of the cornerstones of my fame.

For an early implementation in a computer program, Dijkstra used a set of 64 nodes corresponding to Dutch cities. 64 was chosen so the node number could be encoded in 6 bits [2].

The shortest path can also be formulated as an LP problem. Standard LP solvers can handle quite large problems (each node becomes a constraint, and each arc a variable). In addition, a solver like Cplex contains a specialized network solver for such LPs.

Graph

We start with the nodes. We denote the nodes by $\color{darkblue}n_{i,j}$ representing: $\color{darkblue}a_i$ is assigned to $\color{darkblue}b_j$. Not all assignments are possible. For instance, we cannot assign $\color{darkblue}a_2$ to $\color{darkblue}b_1$. That means: node $\color{darkblue}n_{2,1}$ does not exist.

We also need a source node and a sink node.

The arcs indicate: after assigning $\color{darkblue}a_i \rightarrow \color{darkblue}b_j$ we need to assign the next item $\color{darkblue}a_{i+1}\rightarrow \color{darkblue}b_{j+k}$ for some $k\ge 1$. In addition we need to connect the source node to all nodes with $i=1$ and the sink node to all nodes with $i=3$ (our last $\color{darkblue}a_i$). So our network looks like:

Network representation

Note how any arc not connected to the sink or source node, goes to the right and downwards.

We have costs for visiting each node: $\color{darkblue}a_i\cdot\color{darkblue}b_j$. As we want to formulate this as a shortest path problem, we need to allocate these costs to arcs. I used the incoming arcs for this. So any arc $e_{i,j,i',j'}$ gets a cost $\color{darkblue}a_{i'}\cdot\color{darkblue}b_{j'}$. The arcs to the sink node have a zero cost.

Note: this graph is acyclic, so negative arc lengths are ok. We will never see negative cycles.

As you can see: because the nodes have two indices, the arcs have four! That will be fun.

Longest path problem

This is a question that regularly pops up: how to solve the longest path problem? At first sight, this is easy. Just replace the "minimize" in the shortest path problem by "maximize" and we are good to go. Unfortunately, opposed to the well-known shortest path problem, the longest path problem is not that easy to state and to solve.

1. Standard Shortest Path Formulation

The standard shortest path can be formulated as a Mixed Integer Programming problem:

Shortest Path Problem
\[\begin{align} \min& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j\|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ &\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkblue} b_i = \begin{cases}1 & \text{if $i$ is a source node} \\ -1 & \text{if $i$ is a sink node} \\ 0 & \text{otherwise} \end{cases} \end{align} \]

Shortest Path Problem

\[\begin{align} \min& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ &\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkblue} b_i = \begin{cases}1 & \text{if $i$ is a source node} \\ -1 & \text{if $i$ is a sink node} \\ 0 & \text{otherwise} \end{cases} \end{align} \]

where $A$ is our network and $b$ is a sparse data vector indicating the exogenous inflow or outflow at the source or sink node.

Network with the shortest path

There are some implicit assumptions in this model: there are no negative cycles. In the above example, $d_{i,j}$ represents distance, so we have $d_{i,j}\ge 0$ and there is no problem of negative cycles.

Shortest path problems are very easy MIPs to solve. The solution is automatically integer, so it can be solved as an LP and these LPs are very sparse (only two non-zero elements per column) and very easy to solve. In addition, specialized network solvers are widely available.

The problem in the picture has 20 nodes and 37 × 2 arcs. In this example, the arcs are duplicated to make sure we can go in two directions. This leads to an LP of 74 variables and 20 constraints. The solution looks like:

----     75 VARIABLE z.L                   =      128.538  objective

----     75 VARIABLE x.L  flow

            point6(B)     point15     point16     point18     point20

point5(A)                               1.000
point15         1.000
point16                                             1.000
point18                                                         1.000
point20                     1.000

2. Just use max

When we just turn the above problem in a maximization problem, we get results that are difficult to interpret. But they certainly don't form what we would call a path:

----     78 VARIABLE z.L                   =     1638.406  objective

----     78 VARIABLE x.L  flow

               point1      point2      point3      point4   point5(A)   point6(B)      point7      point8      point9

point1                                              1.000
point3                                                                                                          1.000
point4          1.000
point5(A)                                                                                           1.000
point6(B)                                                                               1.000
point7                                                                      1.000
point8                                                          1.000
point9                                  1.000
point10                     1.000
point12                                                                                                         1.000
point13                                                                                 1.000
point14                                             1.000                                           1.000
point15                                                                     1.000
point16                                                                                             1.000
point17         1.000                               1.000
point18                                 1.000                                                                   1.000
point19                                                                                 1.000
point20                     1.000

        +     point10     point11     point12     point13     point14     point15     point16     point17     point18

point1                                                                                              1.000
point2          1.000
point3                                                                                                          1.000
point4                                                          1.000                               1.000
point5(A)                                                                               1.000
point7                                              1.000
point8                                                          1.000                   1.000
point9                                  1.000                                                                   1.000
point10                                                                     1.000
point11                                                                                                         1.000
point12                                                                                 1.000                   1.000
point14                                                                                 1.000       1.000
point15         1.000
point16                                 1.000                   1.000                               1.000       1.000
point17                                                         1.000                   1.000                   1.000
point18                     1.000       1.000                                                       1.000
point19                                             1.000                   1.000                   1.000       1.000
point20         1.000       1.000                   1.000                   1.000

        +     point19     point20

point2                      1.000
point7          1.000
point10                     1.000
point11                     1.000
point13         1.000       1.000
point15         1.000
point17         1.000
point18         1.000       1.000
point19                     1.000
point20         1.000

Maximization results

Basically, all but a few lines in the plot are traveled in both directions. Conclusion: this is not what we are looking for.

3. TSP-like solution

To form a proper path (called an elementary path), we need to add two sets of constraints [1]:

The outflow of each node goes to just one arc: \[\sum_{j|(i,j)\in A} x_{i,j} \le 1\>\forall i\]
Forbid any sub-tours to be formed. Sub-tour elimination constraints are well-known from the Traveling Salesman Problem.

Using the simple MTZ (Miller, Tucker, and Zemlin) approach, we can formulate:

Longest Path Problem
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j\|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}n-1)(\color{darkred}x_{i,j}-1) && \forall i,j\|i\ne\text{source},j\ne\text{sink} \\&\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}t_i \ge 0 \end{align} \]

Longest Path Problem

\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}n-1)(\color{darkred}x_{i,j}-1) && \forall i,j|i\ne\text{source},j\ne\text{sink} \\&\color{darkred}x_{i,j} \in \{0,1\} \\ & \color{darkred}t_i \ge 0 \end{align} \]

Here $n$ is the number of nodes.

The results look like:

----    126 VARIABLE z.L                   =      432.987  objective

----    126 VARIABLE x.L  flow

               point1      point2      point3      point4   point6(B)      point7      point8      point9     point10     point12

point2                                                                                                          1.000
point4          1.000
point5(A)                                                                               1.000
point9                                  1.000
point12                                                                                             1.000
point14                                             1.000
point15                                                         1.000
point16                                                                                                                     1.000
point19                                                                     1.000
point20                     1.000

        +     point13     point14     point15     point16     point17     point18     point19     point20

point1                                                          1.000
point3                                                                      1.000
point7          1.000
point8                      1.000
point10                                 1.000
point13                                                                                             1.000
point17                                             1.000
point18                                                                                 1.000

Solution of Longest Elementary Path Model

This looks much better.

More formulations can be found in [1].

4. Ad-hoc approach

Let's drop the sub-tour elimination constraints, and have a look at the solution of the remaining model:

Problem A: drop sub-tour elimination constraints
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j\|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ &\color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

When we look at the solution we see a number of sub-tours of just two points. This means: travel from $C\rightarrow D$ and back.

Sub-tours with 2 points

These special sub-tours are easily prevented: add a cut of the form \[x_{i,j}+x_{j,i} \le 1\] We can add them only for the $i \rightarrow j \rightarrow i$ offenders or just for all possible $i \lt j$. Here I did the last thing. The model becomes:

Problem B: add cuts
\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j\|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j\|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}x_{i,j}+\color{darkred}x_{j,i} \le 1 && \forall i \lt j \\ &\color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

Problem B: add cuts

\[\begin{align} \max& \sum_{(i,j)\in\color{darkblue} A} \color{darkblue}d_{i,j} \color{darkred}x_{i,j} \\ & \sum_{j|(j,i)\in \color{darkblue}A} \color{darkred}x_{j,i} + \color{darkblue}b_i = \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} && \forall i \\ & \sum_{j|(i,j)\in \color{darkblue}A} \color{darkred}x_{i,j} \le 1 && \forall i\\ & \color{darkred}x_{i,j}+\color{darkred}x_{j,i} \le 1 && \forall i \lt j \\ &\color{darkred}x_{i,j} \in \{0,1\} \end{align} \]

After adding these cuts and solving problem B we see:

Results after adding cuts

The solution has no new sub-tours, so we can conclude this is the optimal solution.

Thus far this is a bit of an ad-hoc method. This approach would fail if we observe more complex sub-tours. However, it is not too difficult to make this more general. We can add appropriate cuts when we observe sub-tours and resolve the model. Such a cutting plane algorithm can actually outperform models where we add sub-tour elimination constraints in advance. With modern solvers, it is even possible to add this cut generation inside the branch-and-bound search (i.e. no resolve needed).

Conclusion

The conclusion is: "solve the longest path problem" is not as easy as it seems. We need to employ TSP-like machinery to solve this problem. That means no easy MIP models. Straight MIP models are both not that simple to formulate and will only work for relatively small models. A cutting plane approach can help here.

References

Leonardo Taccari, Integer programming formulations for the elementary shortest path problem, European Journal of Operational Research, Volume 252, Issue 1, 1 July 2016, Pages 122-130

Yet Another Math Programming Consultant