## Wednesday, January 19, 2022

### An All-Paths Network Problem

In [1] a graph problem is proposed:

• Given an undirected network,
• Find all paths between two nodes,
• Such that no arc is used more than once,
• And the path does not contain more than $$M$$ arcs
Notice that nodes can be visited several times. I thought this could be a good job for a solution pool approach.

#### Preliminaries

As I am more familiar with directed graphs, let's start with setting up a small random, sparse, directed graph.

I start with 10 nodes and say there is a 20% probability that a link exists between two nodes, $$i \rightarrow j$$. So the data set looks like:

 ----     30 SET n  nodes  n1 ,    n2 ,    n3 ,    n4 ,    n5 ,    n6 ,    n7 ,    n8 ,    n9 ,    n10  ----     30 SET a  directed arcs             n1          n2          n3          n4          n5          n6 n1                     YES                     YESn2                                                                     YESn3                                 YESn4                                 YES         YES                     YESn6                                                         YESn7         YES                                             YESn8                                 YESn9         YES  +          n7          n8          n9         n10 n1                     YESn2                     YESn3                     YES         YESn4         YESn5                                 YESn6                                             YESn8                                             YESn9         YES                                 YES

Using $$n_1$$ as the start node and $$n_{10}$$ as the end node, we can visualize this as:

Note that we have some self-loops in this network (e.g. $$n_4 \rightarrow n_4$$) and also some longer cycles ($$n_8 \rightarrow n_3 \rightarrow n_8$$). For something like the shortest path problem, this is not an issue (we will never choose such a detour as long as we have no negative lengths involved). We will find out if this has implications on the original problem.

A shortest path LP can be stated as:

Shortest Path Model
\begin{align}\min&\sum_{i,j|a(i,j)}\color{darkred}f_{i,j}\\ & \sum_{i|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n = \sum_{j|a(n,j)}\color{darkred}f_{n,j} &&\forall n\\ & \color{darkred}f_{i,j} \in [0,1] \end{align}

where $\color{darkblue}{\mathit{inflow}}_n = \begin{cases} 1 & \text{if n=n_1}\\ -1 & \text{if n=n_{10}} \\ 0 & \text{otherwise}\end{cases}$ When we run this model, we will see:

 ----     82 VARIABLE count.L               =        2.000  length of shortest path (hops) ----     82 VARIABLE f.L  flow             n8         n10 n1       1.000n8                   1.000

So far, so good. To use a solution pool for this model we need to make a proper MIP model. I.e. variables $$\color{darkred}f_{i,j}$$ become binary variables instead of continuous variables between 0 and 1: $$\color{darkred}f_{i,j} \in \{0,1\}$$ . When we use the solution pool to find all solution with a maximum length $$M=3$$, we see:

 ----    127 PARAMETER fsol  solution from solution pool INDEX 1 = soln_sp_p1             n8         n10 n1           1n8                       1 INDEX 1 = soln_sp_p2             n4          n8         n10 n1                       1n4           1n8                                   1 INDEX 1 = soln_sp_p3             n2          n8         n10 n1           1n2                       1n8                                   1 INDEX 1 = soln_sp_p4             n4          n6         n10 n1           1n4                       1n6                                   1 INDEX 1 = soln_sp_p5             n2          n6         n10 n1           1n2                       1n6                                   1 INDEX 1 = soln_sp_p6             n3          n8         n10 n1                       1n3           1n8                                   1

The first solution is just the shortest path solution. The second solution is not what we would expect: it is our shortest path solution plus an isolated tour: $$n_4 \rightarrow n_4$$. Further down we see a similar case with a self-loop for node $$n_3$$. Let's see if we can prevent this behavior by borrowing subtour elimination constraints from TSP (Traveling Salesman Problem) models [2].

#### All-Paths Model V1

Our all-paths model using MTZ (Miller-Tucker-Zemlin) subtour-elimination constraint can look like:

SP + MTZ Model
\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j|a(i,j)}\color{darkred}f_{i,j} \\ &\color{darkred}z \le \color{darkblue}M\\ & \sum_{i|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n = \sum_{j|a(n,j)}\color{darkred}f_{n,j} &&\forall n \\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}N-1)(\color{darkred}f_{i,j}-1) && \forall a_{i,j}, i\ne n_1,j \ne n_{10} \\ & \color{darkred}f_{i,j} \in \{0,1\} \\ & \color{darkred}t_n\ge 0 \end{align}

Remember, we let this model loose on the solution pool, so the bound on $$\color{darkred}z$$ is significant. When we run this with $$M=3$$ we see:

 ----    125 PARAMETER fsol  solution from solution pool  INDEX 1 = soln_sp2_p1             n8         n10 n1           1n8                       1 INDEX 1 = soln_sp2_p2             n2          n6         n10 n1           1n2                       1n6                                   1 INDEX 1 = soln_sp2_p3             n2          n8         n10 n1           1n2                       1n8                                   1 INDEX 1 = soln_sp2_p4             n4          n6         n10 n1           1n4                       1n6                                   1

This looks much better. However, we are not there yet.

When we increase $$M$$ to 4, we expect to see some of these self-loops to be used. However, this does not happen. The reason is that the MTZ formulation puts an ordering on the nodes. This means we cannot revisit a node. So how do we deal with this?

#### All-Paths Model V2

One idea that comes to mind is to create a new graph with the original arcs now functioning as nodes. E.g. a node can be $$n_1 \text{-} n_2$$ or $$n_2 \text{-} n_8$$ . Details:
• I added nodes: $$src \text{-} n_1$$ or $$n_{10} \text{-} snk$$.
• The arcs are indicating adjacency. I.e. we have an arc $$n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8$$.
• I removed self-loops such as : $$n_4 \text{-} n_4 \rightarrow n_4 \text{-} n_4$$. They will not be used. (I could have left them in).
• A picture of the new graph is added as an appending (it is relatively large).
• As you can see in the augmented graph each node corresponds to an original arc. Each arc is a particular case of an original node. For instance: an arc related to $$n_2$$ occurs two times: $$n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_6$$ and $$n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8$$ .

The model will look like:

SP + MTZ Model
\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j,p,q|a(i,j,p,q)}\color{darkred}f_{i,j,p,q} \\ &\color{darkred}z \le \color{darkblue}M+1\\ & \sum_{i,j|a(i,j,p,q)} \color{darkred}f_{i,j,p,q} + \color{darkblue}{\mathit{inflow}}_{p,q} = \sum_{i,j|a(p,q,i,j)}\color{darkred}f_{p,q,i,j} &&\forall p,q \\ & \color{darkred}t_{p,q} \ge \color{darkred}t_{i,j} + 1 + (\color{darkblue}{N}-1)(\color{darkred}f_{i,j,p,q}-1) && \forall a_{i,j,p,q}, \text{except src and snk} \\ & \color{darkred}f_{i,j,p,q} \in \{0,1\} \\ & \color{darkred}t_{i,j}\ge 0 \end{align}

 ----    206 PARAMETER fsoln  solution from solution pool INDEX 1 = soln_sp3_p1               n1.n8      n8.n10     n10.snk n1 .n8                        1n8 .n10                                   1src.n1            1 INDEX 1 = soln_sp3_p2               n1.n2       n2.n8      n8.n10     n10.snk n1 .n2                        1n2 .n8                                    1n8 .n10                                               1src.n1            1 INDEX 1 = soln_sp3_p3               n1.n4       n4.n6      n6.n10     n10.snk n1 .n4                        1n4 .n6                                    1n6 .n10                                               1src.n1            1 INDEX 1 = soln_sp3_p4               n1.n2       n2.n6      n6.n10     n10.snk n1 .n2                        1n2 .n6                                    1n6 .n10                                               1src.n1            1 INDEX 1 = soln_sp3_p5               n1.n8       n3.n8       n8.n3      n8.n10     n10.snk n1 .n8                                    1n3 .n8                                                1n8 .n3                        1n8 .n10                                                           1src.n1            1 INDEX 1 = soln_sp3_p6               n1.n8       n3.n9       n8.n3      n9.n10     n10.snk n1 .n8                                    1n3 .n9                                                1n8 .n3                        1n9 .n10                                                           1src.n1            1 INDEX 1 = soln_sp3_p7               n1.n4       n3.n9       n4.n3      n9.n10     n10.snk n1 .n4                                    1n3 .n9                                                1n4 .n3                        1n9 .n10                                                           1src.n1            1 INDEX 1 = soln_sp3_p8               n1.n4       n3.n8       n4.n3      n8.n10     n10.snk n1 .n4                                    1n3 .n8                                                1n4 .n3                        1n8 .n10                                                           1src.n1            1 INDEX 1 = soln_sp3_p9               n1.n4       n4.n4       n4.n6      n6.n10     n10.snk n1 .n4                        1n4 .n4                                    1n4 .n6                                                1n6 .n10                                                           1src.n1            1

Here we see that indeed the self-loop $$n_4\rightarrow n_4$$ from the original graph is used.

The final task is to make the graph undirected. We can adapt the models for this, or, maybe easier, just duplicate all arcs (except the self-loops).

#### Conclusion

This was a bit more complicated than I expected. On the other hand, this allowed me to play with a graph transformation I don't use every day: convert edges to vertices. Combining this with a Miller-Tucker-Zemlin subtour-elimination constraint makes this an interesting model (at least from my perspective).

#### References

1. Given an undirected graph, what is the optimal way to find all paths between node A and node B given a maximum amount of arcs? (in python), https://stackoverflow.com/questions/70684721/given-an-undirected-graph-what-is-the-optimal-way-to-find-all-paths-between-nod
2. Longest path problem,  https://yetanothermathprogrammingconsultant.blogspot.com/2020/02/longest-path-problem.html

#### Appendix B: GAMS model

 $ontext Original question undirected graph all paths from n1->n2 with max hop count <= M each path cannot reuse an arc I use a directed graph here for convenience.$offtext *----------------------------------------------------------- * network topology *----------------------------------------------------------- option seed=1; set    n 'nodes' /n1*n10/    a(n,n) 'directed arcs' ; alias (n,i,j); a(i,j) = uniform(0,1)<0.2; scalar M 'maximum hops' /3/; display n,a,M; singleton sets    src(n) 'source' /n1/    snk(n) 'sink'   /n10/ ; set ni(n) 'interior nodes'; ni(n) = yes; ni(src) = no; ni(snk) = no; *----------------------------------------------------------- * visualization using GraphViz *----------------------------------------------------------- file gv1 /graph1.gv/; put gv1; put "digraph neato{"/; put "node [shape=circle,style=filled,color=yellowgreen] ",src.tl/; put "node [shape=circle,style=filled,color=orange] ",snk.tl/; put "node [shape=circle,style=filled,color=lightblue]"/; loop(ni, put ni.tl:0," ";); put /; put "edge [color=grey]"/; loop(a(i,j), put i.tl:0,"->",j.tl:0/;); putclose"}"/; *----------------------------------------------------------- *   shortest path n1->n2 as MIP/RMIP model *----------------------------------------------------------- parameter inflow(n) 'exogenous inflow'; inflow(src) = 1; inflow(snk) = -1; binary variable f(i,j) 'flow'; variable count 'length of shortest path (hops)'; equations     obj           'objective'     flowbal(n)    'flow balance at node' ; obj.. count =e= sum(a,f(a)); flowbal(n)..  sum(a(i,n),f(a)) + inflow(n) =e= sum(a(n,j),f(a)); model sp /all/; solve sp minimizing count using rmip; display "#### Model SP: Single Shortest Path Solution", count.l,f.l; *----------------------------------------------------------- *   add hop count *   and solve with solution pool *----------------------------------------------------------- count.up = M; sp.optfile=1; solve sp minimizing count using mip; set   solIndex /soln_sp_p1*soln_sp_p20             soln_sp2_p1*soln_sp2_p20             soln_sp3_p1*soln_sp3_p20/   sol(solIndex) ; parameter fsol(solIndex,n,n) 'solution from solution pool'; execute_load "solutions.gdx",sol=index,fsol=f; option fsol:0:1:1; display "#### Model SP/solution pool: Find all feasible paths with max length",          count.up,sol,fsol,         "#### WARNING: This solution contains subtours"; *----------------------------------------------------------- *   add MTZ subtour elimination constraints *----------------------------------------------------------- positive variable t(i) 'ordering of hops'; equation mtz(i,j) 'subtour elimination constraints'; mtz(a(i,j))$(not src(i) and not snk(j)).. t(j) =g= t(i) + 1 + card(n)*(f(i,j)-1); model sp2 /sp+mtz/; sp2.optfile=1; solve sp2 minimizing count using mip; execute_load "solutions.gdx",sol=index,fsol=f; display "#### Model SP2/solution pool: Find all feasible paths with max length", sol,fsol, "#### WARNING: This solution excludes tours"; *----------------------------------------------------------- * make arcs nodes, and arcs indicate adjacency *----------------------------------------------------------- set in(*) 'references to old nodes while adding src/snk' /src,snk/; in(n) = yes; alias(in,jn,kn); set nn(*,*) 'new nodes' /src.n1, n10.snk/; nn(a) = yes; display nn; alias (nn,nn2); set an(*,*,*,*) 'new arcs (w/o self-loops)'; an(in,jn,jn,kn)$(nn(in,jn) and nn(jn,kn)                 and not (sameas(in,jn) and sameas(jn,kn))) = yes; option an:0:2:2; display an; *----------------------------------------------------------- * visualization using GraphViz *----------------------------------------------------------- file gv2 /graph2.gv/; put gv2; put "digraph neato{"/; put "node [shape=circle,style=filled,color=yellowgreen] "; loop(nn('src',in), put 'src_':0,in.tl:0/ ); put "node [shape=circle,style=filled,color=orange] "; loop(nn(in,'snk'), put in.tl:0,'_snk':0/ ); put "node [shape=circle,style=filled,color=lightblue]"/; loop(nn(in,jn)$(not sameas(in,'src') and not sameas(jn,'snk')), put in.tl:0,"_":0,jn.tl:0," ":0;); put /; put "edge [color=grey]"/; loop(an(in,jn,jn,kn), put in.tl:0,"_":0,jn.tl:0,"->":0,jn.tl:0,"_",kn.tl:0/); putclose"}"/; *----------------------------------------------------------- * Model v2 *----------------------------------------------------------- M = 4; singleton sets srcn(*,*) / src.n1 / snkn(*,*) /n10.snk / ; parameter inflown(*,*) 'new inflow'; inflown(srcn) = 1; inflown(snkn) = -1; binary variables fn 'new flow variables'; positive variable tn 'MTZ variable'; equations objn 'new objective' flowbaln 'new flow balance on nodes' mtzn 'new MTZ subtour elimination constraints' ; objn.. count =e= sum(an,fn(an)); flowbaln(nn).. sum(an(nn2,nn),fn(an)) + inflown(nn) =e= sum(an(nn,nn2),fn(an)); mtzn(an(nn,nn2))$(not srcn(nn) and not snkn(nn2))..     tn(nn2) =g= tn(nn) + 1 + card(nn)*(fn(nn,nn2)-1); count.up = M+1; model sp3 /objn,flowbaln,mtzn/; sp3.optfile=1; solve sp3 minimizing count using mip; parameter fsoln(solIndex,*,*,*,*) 'solution from solution pool'; execute_load "solutions.gdx",sol=index,fsoln=fn; option fsoln:0:2:2; display "#### Model SP3/solution pool: augmented graph",          count.up,sol,fsoln; $onecho > cplex.opt SolnPoolIntensity = 4 PopulateLim = 10000 SolnPoolPop = 2 solnpoolmerge solutions.gdx$offecho

1. A model for the same problem in MiniZinc that uses the pre-defined global constraint bounded_path (https://www.minizinc.org/doc-2.5.5/en/lib-globals.html#index-127): https://gist.github.com/zayenz/c4a779fa0e3a087c997b8962d7e0e5d5 By setting the solver configuration to find all solutions, it finds 19 paths of length at most 4. See statistics in the Gist.

Note that this model will not use self-loops either, as it uses the definition of a path that a node can only be visited once.

1. Are you using a the undirected version of the graph ? the last path shown in the Gist is not feasible in the directed version (which is why you have 19 paths and not 9 I suppose)

2. How about sequentially solving the shortest path model with 1/ binary variables 2/ a constraint to limit the number of arcs 3/ no good cuts to exclude previous solutions ?

1. That should be essentially the same as using the solution pool.

2. I was thinking that perhaps with this approach the MTZ constraints would not be necessary, as the no good cuts would exclude the "first shortest path + an isolated tour".

3. At some stage the shortest path model may introduce subtours just to be different.

4. Not necessarily if we use the right no good cuts: https://or.stackexchange.com/questions/7966/how-to-compute-all-paths-between-two-given-nodes-in-a-network?noredirect=1#comment16869_7966