Yet Another Math Programming Consultant: An All-Paths Network Problem

In [1] a graph problem is proposed:

Given an undirected network,
Find all paths between two nodes,
Such that no arc is used more than once,
And the path does not contain more than $M$ arcs

Notice that nodes can be visited several times. I thought this could be a good job for a solution pool approach.

Preliminaries

As I am more familiar with directed graphs, let's start with setting up a small random, sparse, directed graph.

I start with 10 nodes and say there is a 20% probability that a link exists between two nodes, $i \rightarrow j$. So the data set looks like:

---- 30 SET n nodes

n1 , n2 , n3 , n4 , n5 , n6 , n7 , n8 , n9 , n10

---- 30 SET a directed arcs

n1 n2 n3 n4 n5 n6

n1                     YES                     YES
n2                                                                     YES
n3                                 YES
n4                                 YES         YES                     YES
n6                                                         YES
n7         YES                                             YES
n8                                 YES
n9         YES

+ n7 n8 n9 n10

n1                     YES
n2                     YES
n3                     YES         YES
n4         YES
n5                                 YES
n6                                             YES
n8                                             YES
n9         YES                                 YES

Using $n_1$ as the start node and $n_{10}$ as the end node, we can visualize this as:

Note that we have some self-loops in this network (e.g. $n_4 \rightarrow n_4$) and also some longer cycles ($n_8 \rightarrow n_3 \rightarrow n_8$). For something like the shortest path problem, this is not an issue (we will never choose such a detour as long as we have no negative lengths involved). We will find out if this has implications on the original problem.

A shortest path LP can be stated as:

Shortest Path Model
\[\begin{align}\min&\sum_{i,j\|a(i,j)}\color{darkred}f_{i,j}\\ & \sum_{i\|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n = \sum_{j\|a(n,j)}\color{darkred}f_{n,j} &&\forall n\\ & \color{darkred}f_{i,j} \in [0,1] \end{align}\]

where \[ \color{darkblue}{\mathit{inflow}}_n = \begin{cases} 1 & \text{if $n=n_1$}\\ -1 & \text{if $n=n_{10}$} \\ 0 & \text{otherwise}\end{cases}\] When we run this model, we will see:

---- 82 VARIABLE count.L = 2.000 length of shortest path (hops)

---- 82 VARIABLE f.L flow

n8 n10

n1 1.000
n8 1.000

So far, so good. To use a solution pool for this model we need to make a proper MIP model. I.e. variables $\color{darkred}f_{i,j}$ become binary variables instead of continuous variables between 0 and 1: $\color{darkred}f_{i,j} \in \{0,1\}$ . When we use the solution pool to find all solution with a maximum length $M=3$, we see:

---- 127 PARAMETER fsol solution from solution pool

INDEX 1 = soln_sp_p1

n8 n10

n1 1
n8 1

INDEX 1 = soln_sp_p2

n4 n8 n10

n1                       1
n4           1
n8                                   1

INDEX 1 = soln_sp_p3

n2 n8 n10

n1           1
n2                       1
n8                                   1

INDEX 1 = soln_sp_p4

n4 n6 n10

n1           1
n4                       1
n6                                   1

INDEX 1 = soln_sp_p5

n2 n6 n10

n1           1
n2                       1
n6                                   1

INDEX 1 = soln_sp_p6

n3 n8 n10

n1                       1
n3           1
n8                                   1

The first solution is just the shortest path solution. The second solution is not what we would expect: it is our shortest path solution plus an isolated tour: $n_4 \rightarrow n_4$. Further down we see a similar case with a self-loop for node $n_3$. Let's see if we can prevent this behavior by borrowing subtour elimination constraints from TSP (Traveling Salesman Problem) models [2].

All-Paths Model V1

Our all-paths model using MTZ (Miller-Tucker-Zemlin) subtour-elimination constraint can look like:

SP + MTZ Model
\[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j\|a(i,j)}\color{darkred}f_{i,j} \\ &\color{darkred}z \le \color{darkblue}M\\ & \sum_{i\|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n = \sum_{j\|a(n,j)}\color{darkred}f_{n,j} &&\forall n \\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}N-1)(\color{darkred}f_{i,j}-1) && \forall a_{i,j}, i\ne n_1,j \ne n_{10} \\ & \color{darkred}f_{i,j} \in \{0,1\} \\ & \color{darkred}t_n\ge 0 \end{align}\]

SP + MTZ Model

\[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j|a(i,j)}\color{darkred}f_{i,j} \\ &\color{darkred}z \le \color{darkblue}M\\ & \sum_{i|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n = \sum_{j|a(n,j)}\color{darkred}f_{n,j} &&\forall n \\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}N-1)(\color{darkred}f_{i,j}-1) && \forall a_{i,j}, i\ne n_1,j \ne n_{10} \\ & \color{darkred}f_{i,j} \in \{0,1\} \\ & \color{darkred}t_n\ge 0 \end{align}\]

Remember, we let this model loose on the solution pool, so the bound on $\color{darkred}z$ is significant. When we run this with $M=3$ we see:

---- 125 PARAMETER fsol solution from solution pool

INDEX 1 = soln_sp2_p1

n8 n10

n1 1
n8 1

INDEX 1 = soln_sp2_p2

n2 n6 n10

n1           1
n2                       1
n6                                   1

INDEX 1 = soln_sp2_p3

n2 n8 n10

n1           1
n2                       1
n8                                   1

INDEX 1 = soln_sp2_p4

n4 n6 n10

n1           1
n4                       1
n6                                   1

This looks much better. However, we are not there yet.

When we increase $M$ to 4, we expect to see some of these self-loops to be used. However, this does not happen. The reason is that the MTZ formulation puts an ordering on the nodes. This means we cannot revisit a node. So how do we deal with this?

All-Paths Model V2

One idea that comes to mind is to create a new graph with the original arcs now functioning as nodes. E.g. a node can be $n_1 \text{-} n_2$ or $n_2 \text{-} n_8$ . Details:

I added nodes: $src \text{-} n_1$ or $n_{10} \text{-} snk$.
The arcs are indicating adjacency. I.e. we have an arc $n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8$.
I removed self-loops such as : $n_4 \text{-} n_4 \rightarrow n_4 \text{-} n_4$. They will not be used. (I could have left them in).
A picture of the new graph is added as an appending (it is relatively large).
As you can see in the augmented graph each node corresponds to an original arc. Each arc is a particular case of an original node. For instance: an arc related to $n_2$ occurs two times: $n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_6$ and $n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8$ .

The model will look like:

SP + MTZ Model
\[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j,p,q\|a(i,j,p,q)}\color{darkred}f_{i,j,p,q} \\ &\color{darkred}z \le \color{darkblue}M+1\\ & \sum_{i,j\|a(i,j,p,q)} \color{darkred}f_{i,j,p,q} + \color{darkblue}{\mathit{inflow}}_{p,q} = \sum_{i,j\|a(p,q,i,j)}\color{darkred}f_{p,q,i,j} &&\forall p,q \\ & \color{darkred}t_{p,q} \ge \color{darkred}t_{i,j} + 1 + (\color{darkblue}{N}-1)(\color{darkred}f_{i,j,p,q}-1) && \forall a_{i,j,p,q}, \text{except src and snk} \\ & \color{darkred}f_{i,j,p,q} \in \{0,1\} \\ & \color{darkred}t_{i,j}\ge 0 \end{align}\]

SP + MTZ Model

\[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j,p,q|a(i,j,p,q)}\color{darkred}f_{i,j,p,q} \\ &\color{darkred}z \le \color{darkblue}M+1\\ & \sum_{i,j|a(i,j,p,q)} \color{darkred}f_{i,j,p,q} + \color{darkblue}{\mathit{inflow}}_{p,q} = \sum_{i,j|a(p,q,i,j)}\color{darkred}f_{p,q,i,j} &&\forall p,q \\ & \color{darkred}t_{p,q} \ge \color{darkred}t_{i,j} + 1 + (\color{darkblue}{N}-1)(\color{darkred}f_{i,j,p,q}-1) && \forall a_{i,j,p,q}, \text{except src and snk} \\ & \color{darkred}f_{i,j,p,q} \in \{0,1\} \\ & \color{darkred}t_{i,j}\ge 0 \end{align}\]

---- 206 PARAMETER fsoln solution from solution pool

INDEX 1 = soln_sp3_p1

n1.n8 n8.n10 n10.snk

n1 .n8                        1
n8 .n10                                   1
src.n1            1

INDEX 1 = soln_sp3_p2

n1.n2 n2.n8 n8.n10 n10.snk

n1 .n2                        1
n2 .n8                                    1
n8 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p3

n1.n4 n4.n6 n6.n10 n10.snk

n1 .n4                        1
n4 .n6                                    1
n6 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p4

n1.n2 n2.n6 n6.n10 n10.snk

n1 .n2                        1
n2 .n6                                    1
n6 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p5

n1.n8 n3.n8 n8.n3 n8.n10 n10.snk

n1 .n8                                    1
n3 .n8                                                1
n8 .n3                        1
n8 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p6

n1.n8 n3.n9 n8.n3 n9.n10 n10.snk

n1 .n8                                    1
n3 .n9                                                1
n8 .n3                        1
n9 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p7

n1.n4 n3.n9 n4.n3 n9.n10 n10.snk

n1 .n4                                    1
n3 .n9                                                1
n4 .n3                        1
n9 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p8

n1.n4 n3.n8 n4.n3 n8.n10 n10.snk

n1 .n4                                    1
n3 .n8                                                1
n4 .n3                        1
n8 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p9

n1.n4 n4.n4 n4.n6 n6.n10 n10.snk

n1 .n4                        1
n4 .n4                                    1
n4 .n6                                                1
n6 .n10                                                           1
src.n1            1

Here we see that indeed the self-loop $n_4\rightarrow n_4$ from the original graph is used.

The final task is to make the graph undirected. We can adapt the models for this, or, maybe easier, just duplicate all arcs (except the self-loops).

Conclusion

This was a bit more complicated than I expected. On the other hand, this allowed me to play with a graph transformation I don't use every day: convert edges to vertices. Combining this with a Miller-Tucker-Zemlin subtour-elimination constraint makes this an interesting model (at least from my perspective).

References

Given an undirected graph, what is the optimal way to find all paths between node A and node B given a maximum amount of arcs? (in python), https://stackoverflow.com/questions/70684721/given-an-undirected-graph-what-is-the-optimal-way-to-find-all-paths-between-nod
Longest path problem, https://yetanothermathprogrammingconsultant.blogspot.com/2020/02/longest-path-problem.html

Appendix A: augmented graph

Appendix B: GAMS model

$ontext

Original question

      undirected graph
      all paths from n1->n2 with max hop count <= M
      each path cannot reuse an arc

I use a directed graph here for convenience.

$offtext

*-----------------------------------------------------------
* network topology
*-----------------------------------------------------------

option seed=1;

set
   n 'nodes' /n1*n10/
   a(n,n) 'directed arcs'
;

alias (n,i,j);

a(i,j) = uniform(0,1)<0.2;

scalar M 'maximum hops' /3/;

display n,a,M;

singleton sets
   src(n) 'source' /n1/
   snk(n) 'sink'   /n10/
;
set ni(n) 'interior nodes';
ni(n) = yes;
ni(src) = no;
ni(snk) = no;

*-----------------------------------------------------------
* visualization using GraphViz
*-----------------------------------------------------------

file gv1 /graph1.gv/;
put gv1;
put "digraph neato{"/;
put "node [shape=circle,style=filled,color=yellowgreen] ",src.tl/;
put "node [shape=circle,style=filled,color=orange] ",snk.tl/;
put "node [shape=circle,style=filled,color=lightblue]"/;
loop(ni, put ni.tl:0," ";); put /;
put "edge [color=grey]"/;
loop(a(i,j), put i.tl:0,"->",j.tl:0/;);
putclose"}"/;

*-----------------------------------------------------------
*   shortest path n1->n2 as MIP/RMIP model
*-----------------------------------------------------------

parameter inflow(n) 'exogenous inflow';
inflow(src) = 1;
inflow(snk) = -1;

binary variable f(i,j) 'flow';
variable count 'length of shortest path (hops)';

equations
    obj           'objective'
    flowbal(n)    'flow balance at node'
;

obj.. count =e= sum(a,f(a));
flowbal(n).. sum(a(i,n),f(a)) + inflow(n) =e= sum(a(n,j),f(a));

model sp /all/;
solve sp minimizing count using rmip;

display "#### Model SP: Single Shortest Path Solution", count.l,f.l;

*-----------------------------------------------------------
*   add hop count
*   and solve with solution pool
*-----------------------------------------------------------

count.up = M;

sp.optfile=1;
solve sp minimizing count using mip;

set
solIndex /soln_sp_p1*soln_sp_p20
            soln_sp2_p1*soln_sp2_p20
            soln_sp3_p1*soln_sp3_p20/
sol(solIndex)
;
parameter fsol(solIndex,n,n) 'solution from solution pool';

execute_load "solutions.gdx",sol=index,fsol=f;
option fsol:0:1:1;
display "#### Model SP/solution pool: Find all feasible paths with max length",
         count.up,sol,fsol,
        "#### WARNING: This solution contains subtours";

*-----------------------------------------------------------
*   add MTZ subtour elimination constraints
*-----------------------------------------------------------

positive variable t(i) 'ordering of hops';

equation mtz(i,j) 'subtour elimination constraints';

mtz(a(i,j))$(not src(i) and not snk(j))..
    t(j) =g= t(i) + 1 + card(n)*(f(i,j)-1);

model sp2 /sp+mtz/;
sp2.optfile=1;
solve sp2 minimizing count using mip;

execute_load "solutions.gdx",sol=index,fsol=f;
display "#### Model SP2/solution pool: Find all feasible paths with max length",
         sol,fsol,
        "#### WARNING: This solution excludes tours";

*-----------------------------------------------------------
* make arcs nodes, and arcs indicate adjacency
*-----------------------------------------------------------

set in(*) 'references to old nodes while adding src/snk' /src,snk/;
in(n) = yes;
alias(in,jn,kn);

set nn(*,*) 'new nodes' /src.n1, n10.snk/;
nn(a) = yes;
display nn;
alias (nn,nn2);

set an(*,*,*,*) 'new arcs (w/o self-loops)';
an(in,jn,jn,kn)$(nn(in,jn) and nn(jn,kn)
                and not (sameas(in,jn) and sameas(jn,kn))) = yes;
option an:0:2:2;
display an;

*-----------------------------------------------------------
* visualization using GraphViz
*-----------------------------------------------------------

file gv2 /graph2.gv/;
put gv2;
put "digraph neato{"/;
put "node [shape=circle,style=filled,color=yellowgreen] ";
loop(nn('src',in), put 'src_':0,in.tl:0/ );
put "node [shape=circle,style=filled,color=orange] ";
loop(nn(in,'snk'), put in.tl:0,'_snk':0/ );
put "node [shape=circle,style=filled,color=lightblue]"/;
loop(nn(in,jn)$(not sameas(in,'src') and not sameas(jn,'snk')),
    put in.tl:0,"_":0,jn.tl:0," ":0;);
put /;
put "edge [color=grey]"/;
loop(an(in,jn,jn,kn),
   put in.tl:0,"_":0,jn.tl:0,"->":0,jn.tl:0,"_",kn.tl:0/);
putclose"}"/;

*-----------------------------------------------------------
* Model v2
*-----------------------------------------------------------

M = 4;

singleton sets
   srcn(*,*) / src.n1 /
   snkn(*,*) /n10.snk /
;

parameter inflown(*,*) 'new inflow';
inflown(srcn) = 1;
inflown(snkn) = -1;

binary variables fn 'new flow variables';
positive variable tn 'MTZ variable';

equations
    objn     'new objective'
    flowbaln 'new flow balance on nodes'
    mtzn     'new MTZ subtour elimination constraints'
;

objn.. count =e= sum(an,fn(an));
flowbaln(nn).. sum(an(nn2,nn),fn(an)) + inflown(nn) =e= sum(an(nn,nn2),fn(an));
mtzn(an(nn,nn2))$(not srcn(nn) and not snkn(nn2))..
    tn(nn2) =g= tn(nn) + 1 + card(nn)*(fn(nn,nn2)-1);
count.up = M+1;

model sp3 /objn,flowbaln,mtzn/;
sp3.optfile=1;
solve sp3 minimizing count using mip;

parameter fsoln(solIndex,*,*,*,*) 'solution from solution pool';

execute_load "solutions.gdx",sol=index,fsoln=fn;
option fsoln:0:2:2;
display "#### Model SP3/solution pool: augmented graph",
         count.up,sol,fsoln;

$onecho > cplex.opt
SolnPoolIntensity = 4
PopulateLim = 10000
SolnPoolPop = 2
solnpoolmerge solutions.gdx
$offecho

7 comments:

MikaelJanuary 19, 2022 at 8:27 AM
A model for the same problem in MiniZinc that uses the pre-defined global constraint bounded_path (https://www.minizinc.org/doc-2.5.5/en/lib-globals.html#index-127): https://gist.github.com/zayenz/c4a779fa0e3a087c997b8962d7e0e5d5 By setting the solver configuration to find all solutions, it finds 19 paths of length at most 4. See statistics in the Gist.

Note that this model will not use self-loops either, as it uses the definition of a path that a node can only be visited once.
KuifjeFebruary 28, 2022 at 4:42 PM
How about sequentially solving the shortest path model with 1/ binary variables 2/ a constraint to limit the number of arcs 3/ no good cuts to exclude previous solutions ?

Wednesday, January 19, 2022

An All-Paths Network Problem

Preliminaries

All-Paths Model V1

All-Paths Model V2

Conclusion

References

Appendix A: augmented graph

Appendix B: GAMS model

7 comments: