Wednesday, January 19, 2022

An All-Paths Network Problem

In [1] a graph problem is proposed:

  • Given an undirected network,
  • Find all paths between two nodes,
  • Such that no arc is used more than once,
  • And the path does not contain more than \(M\) arcs
Notice that nodes can be visited several times. I thought this could be a good job for a solution pool approach. 

Preliminaries


As I am more familiar with directed graphs, let's start with setting up a small random, sparse, directed graph. 

I start with 10 nodes and say there is a 20% probability that a link exists between two nodes, \(i \rightarrow j\). So the data set looks like:


----     30 SET n  nodes

 n1 ,    n2 ,    n3 ,    n4 ,    n5 ,    n6 ,    n7 ,    n8 ,    n9 ,    n10 

----     30 SET a  directed arcs

            n1          n2          n3          n4          n5          n6

n1                     YES                     YES
n2                                                                     YES
n3                                 YES
n4                                 YES         YES                     YES
n6                                                         YES
n7         YES                                             YES
n8                                 YES
n9         YES

 +          n7          n8          n9         n10

n1                     YES
n2                     YES
n3                     YES         YES
n4         YES
n5                                 YES
n6                                             YES
n8                                             YES
n9         YES                                 YES



Using \(n_1\) as the start node and \(n_{10}\) as the end node, we can visualize this as:





Note that we have some self-loops in this network (e.g. \(n_4 \rightarrow n_4\)) and also some longer cycles (\(n_8 \rightarrow n_3 \rightarrow n_8\)). For something like the shortest path problem, this is not an issue (we will never choose such a detour as long as we have no negative lengths involved). We will find out if this has implications on the original problem. 

A shortest path LP can be stated as:

Shortest Path Model
\[\begin{align}\min&\sum_{i,j|a(i,j)}\color{darkred}f_{i,j}\\ & \sum_{i|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n =  \sum_{j|a(n,j)}\color{darkred}f_{n,j} &&\forall n\\ & \color{darkred}f_{i,j} \in [0,1] \end{align}\]

where \[ \color{darkblue}{\mathit{inflow}}_n = \begin{cases} 1 & \text{if $n=n_1$}\\ -1 & \text{if $n=n_{10}$} \\ 0 & \text{otherwise}\end{cases}\] When we run this model, we will see:

----     82 VARIABLE count.L               =        2.000  length of shortest path (hops)

----     82 VARIABLE f.L  flow

            n8         n10

n1       1.000
n8                   1.000



So far, so good. To use a solution pool for this model we need to make a proper MIP model. I.e. variables \(\color{darkred}f_{i,j}\) become binary variables instead of continuous variables between 0 and 1: \(\color{darkred}f_{i,j} \in \{0,1\}\) . When we use the solution pool to find all solution with a maximum length \(M=3\), we see:


----    127 PARAMETER fsol  solution from solution pool

INDEX 1 = soln_sp_p1

            n8         n10

n1           1
n8                       1

INDEX 1 = soln_sp_p2

            n4          n8         n10

n1                       1
n4           1
n8                                   1

INDEX 1 = soln_sp_p3

            n2          n8         n10

n1           1
n2                       1
n8                                   1

INDEX 1 = soln_sp_p4

            n4          n6         n10

n1           1
n4                       1
n6                                   1

INDEX 1 = soln_sp_p5

            n2          n6         n10

n1           1
n2                       1
n6                                   1

INDEX 1 = soln_sp_p6

            n3          n8         n10

n1                       1
n3           1
n8                                   1



The first solution is just the shortest path solution. The second solution is not what we would expect: it is our shortest path solution plus an isolated tour: \(n_4 \rightarrow n_4\). Further down we see a similar case with a self-loop for node \(n_3\). Let's see if we can prevent this behavior by borrowing subtour elimination constraints from TSP (Traveling Salesman Problem) models [2].

All-Paths Model V1


Our all-paths model using MTZ (Miller-Tucker-Zemlin) subtour-elimination constraint can look like:


SP + MTZ Model
 \[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j|a(i,j)}\color{darkred}f_{i,j} \\ &\color{darkred}z \le \color{darkblue}M\\ & \sum_{i|a(i,n)} \color{darkred}f_{i,n} + \color{darkblue}{\mathit{inflow}}_n =  \sum_{j|a(n,j)}\color{darkred}f_{n,j} &&\forall n \\ & \color{darkred}t_j \ge \color{darkred}t_i + 1 + (\color{darkblue}N-1)(\color{darkred}f_{i,j}-1) && \forall a_{i,j}, i\ne n_1,j \ne n_{10}  \\ & \color{darkred}f_{i,j} \in \{0,1\} \\ & \color{darkred}t_n\ge 0 \end{align}\]


Remember, we let this model loose on the solution pool, so the bound on \(\color{darkred}z\) is significant. When we run this with \(M=3\) we see:


----    125 PARAMETER fsol  solution from solution pool 

INDEX 1 = soln_sp2_p1

            n8         n10

n1           1
n8                       1

INDEX 1 = soln_sp2_p2

            n2          n6         n10

n1           1
n2                       1
n6                                   1

INDEX 1 = soln_sp2_p3

            n2          n8         n10

n1           1
n2                       1
n8                                   1

INDEX 1 = soln_sp2_p4

            n4          n6         n10

n1           1
n4                       1
n6                                   1



This looks much better. However, we are not there yet.

When we increase \(M\) to 4, we expect to see some of these self-loops to be used. However, this does not happen. The reason is that the MTZ formulation puts an ordering on the nodes. This means we cannot revisit a node. So how do we deal with this?

All-Paths Model V2


One idea that comes to mind is to create a new graph with the original arcs now functioning as nodes. E.g. a node can be \(n_1 \text{-} n_2\) or \(n_2 \text{-} n_8\) . Details:
  • I added nodes: \(src \text{-} n_1\) or \(n_{10} \text{-} snk\).
  • The arcs are indicating adjacency. I.e. we have an arc \(n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8\).
  • I removed self-loops such as : \(n_4 \text{-} n_4 \rightarrow n_4 \text{-} n_4\). They will not be used. (I could have left them in). 
  • A picture of the new graph is added as an appending (it is relatively large).
  • As you can see in the augmented graph each node corresponds to an original arc. Each arc is a particular case of an original node. For instance: an arc related to \(n_2\) occurs two times: \(n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_6\) and \(n_1 \text{-} n_2 \rightarrow n_2 \text{-} n_8\) .

The model will look like:



SP + MTZ Model
 \[\begin{align} \min\>&\color{darkred}z \\ &\color{darkred}z= \sum_{i,j,p,q|a(i,j,p,q)}\color{darkred}f_{i,j,p,q} \\ &\color{darkred}z \le \color{darkblue}M+1\\ & \sum_{i,j|a(i,j,p,q)} \color{darkred}f_{i,j,p,q} + \color{darkblue}{\mathit{inflow}}_{p,q} =  \sum_{i,j|a(p,q,i,j)}\color{darkred}f_{p,q,i,j} &&\forall p,q \\ & \color{darkred}t_{p,q} \ge \color{darkred}t_{i,j} + 1 + (\color{darkblue}{N}-1)(\color{darkred}f_{i,j,p,q}-1) && \forall a_{i,j,p,q}, \text{except src and snk}  \\ & \color{darkred}f_{i,j,p,q} \in \{0,1\} \\ & \color{darkred}t_{i,j}\ge 0 \end{align}\]




----    206 PARAMETER fsoln  solution from solution pool

INDEX 1 = soln_sp3_p1

              n1.n8      n8.n10     n10.snk

n1 .n8                        1
n8 .n10                                   1
src.n1            1

INDEX 1 = soln_sp3_p2

              n1.n2       n2.n8      n8.n10     n10.snk

n1 .n2                        1
n2 .n8                                    1
n8 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p3

              n1.n4       n4.n6      n6.n10     n10.snk

n1 .n4                        1
n4 .n6                                    1
n6 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p4

              n1.n2       n2.n6      n6.n10     n10.snk

n1 .n2                        1
n2 .n6                                    1
n6 .n10                                               1
src.n1            1

INDEX 1 = soln_sp3_p5

              n1.n8       n3.n8       n8.n3      n8.n10     n10.snk

n1 .n8                                    1
n3 .n8                                                1
n8 .n3                        1
n8 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p6

              n1.n8       n3.n9       n8.n3      n9.n10     n10.snk

n1 .n8                                    1
n3 .n9                                                1
n8 .n3                        1
n9 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p7

              n1.n4       n3.n9       n4.n3      n9.n10     n10.snk

n1 .n4                                    1
n3 .n9                                                1
n4 .n3                        1
n9 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p8

              n1.n4       n3.n8       n4.n3      n8.n10     n10.snk

n1 .n4                                    1
n3 .n8                                                1
n4 .n3                        1
n8 .n10                                                           1
src.n1            1

INDEX 1 = soln_sp3_p9

              n1.n4       n4.n4       n4.n6      n6.n10     n10.snk

n1 .n4                        1
n4 .n4                                    1
n4 .n6                                                1
n6 .n10                                                           1
src.n1            1



Here we see that indeed the self-loop \(n_4\rightarrow n_4\) from the original graph is used.

The final task is to make the graph undirected. We can adapt the models for this, or, maybe easier, just duplicate all arcs (except the self-loops).


Conclusion


This was a bit more complicated than I expected. On the other hand, this allowed me to play with a graph transformation I don't use every day: convert edges to vertices. Combining this with a Miller-Tucker-Zemlin subtour-elimination constraint makes this an interesting model (at least from my perspective).

References


  1. Given an undirected graph, what is the optimal way to find all paths between node A and node B given a maximum amount of arcs? (in python), https://stackoverflow.com/questions/70684721/given-an-undirected-graph-what-is-the-optimal-way-to-find-all-paths-between-nod
  2. Longest path problem,  https://yetanothermathprogrammingconsultant.blogspot.com/2020/02/longest-path-problem.html

Appendix A: augmented graph




Appendix B: GAMS model


$ontext

 
Original question

     
undirected graph
     
all paths from n1->n2 with max hop count <= M
     
each path cannot reuse an arc

 
I use a directed graph here for convenience.

$offtext

*-----------------------------------------------------------
* network topology
*-----------------------------------------------------------

option seed=1;

set
   n
'nodes' /n1*n10/
   a(n,n)
'directed arcs'
;

alias (n,i,j);

a(i,j) = uniform(0,1)<0.2;

scalar M 'maximum hops' /3/;

display n,a,M;


singleton sets
   src(n)
'source' /n1/
   snk(n)
'sink'   /n10/
;
set ni(n) 'interior nodes';
ni(n) =
yes;
ni(src) =
no;
ni(snk) =
no;


*-----------------------------------------------------------
* visualization using GraphViz
*-----------------------------------------------------------

file gv1 /graph1.gv/;
put gv1;
put "digraph neato{"/;
put "node [shape=circle,style=filled,color=yellowgreen] ",src.tl/;
put "node [shape=circle,style=filled,color=orange] ",snk.tl/;
put "node [shape=circle,style=filled,color=lightblue]"/;
loop(ni, put ni.tl:0," ";); put /;
put "edge [color=grey]"/;
loop(a(i,j), put i.tl:0,"->",j.tl:0/;);
putclose"}"/;


*-----------------------------------------------------------
*   shortest path n1->n2 as MIP/RMIP model
*-----------------------------------------------------------

parameter inflow(n) 'exogenous inflow';
inflow(src) = 1;
inflow(snk) = -1;

binary variable f(i,j) 'flow';
variable count 'length of shortest path (hops)';

equations
    obj          
'objective'
    flowbal(n)   
'flow balance at node'
;

obj.. count =e=
sum(a,f(a));
flowbal(n).. 
sum(a(i,n),f(a)) + inflow(n) =e= sum(a(n,j),f(a));

model sp /all/;
solve sp minimizing count using rmip;

display "#### Model SP: Single Shortest Path Solution", count.l,f.l;


*-----------------------------------------------------------
*   add hop count
*   and solve with solution pool
*-----------------------------------------------------------

count.up = M;

sp.optfile=1;
solve sp minimizing count using mip;

set
  solIndex
/soln_sp_p1*soln_sp_p20
           
soln_sp2_p1*soln_sp2_p20
           
soln_sp3_p1*soln_sp3_p20/
  sol(solIndex)
;
parameter fsol(solIndex,n,n) 'solution from solution pool';

execute_load "solutions.gdx",sol=index,fsol=f;
option fsol:0:1:1;
display "#### Model SP/solution pool: Find all feasible paths with max length",
         count.up,sol,fsol,
       
"#### WARNING: This solution contains subtours";

*-----------------------------------------------------------
*   add MTZ subtour elimination constraints
*-----------------------------------------------------------

positive variable t(i) 'ordering of hops';

equation mtz(i,j) 'subtour elimination constraints';

mtz(a(i,j))$(
not src(i) and not snk(j))..
    t(j) =g= t(i) + 1 +
card(n)*(f(i,j)-1);

model sp2 /sp+mtz/;
sp2.optfile=1;
solve sp2 minimizing count using mip;

execute_load "solutions.gdx",sol=index,fsol=f;
display "#### Model SP2/solution pool: Find all feasible paths with max length",
         sol,fsol,
       
"#### WARNING: This solution excludes tours";


*-----------------------------------------------------------
*  make arcs nodes, and arcs indicate adjacency
*-----------------------------------------------------------

set in(*)  'references to old nodes while adding src/snk' /src,snk/;
in(n) =
yes;
alias(in,jn,kn);

set nn(*,*) 'new nodes' /src.n1, n10.snk/;
nn(a) =
yes;
display nn;
alias (nn,nn2);

set an(*,*,*,*) 'new arcs (w/o self-loops)';
an(in,jn,jn,kn)$(nn(in,jn)
and nn(jn,kn)
               
and not (sameas(in,jn) and sameas(jn,kn))) = yes;
option an:0:2:2;
display an;


*-----------------------------------------------------------
* visualization using GraphViz
*-----------------------------------------------------------

file gv2 /graph2.gv/;
put gv2;
put "digraph neato{"/;
put "node [shape=circle,style=filled,color=yellowgreen] ";
loop(nn('src',in), put 'src_':0,in.tl:0/ );
put "node [shape=circle,style=filled,color=orange] ";
loop(nn(in,'snk'), put in.tl:0,'_snk':0/ );
put "node [shape=circle,style=filled,color=lightblue]"/;
loop(nn(in,jn)$(not sameas(in,'src') and not sameas(jn,'snk')),
   
put in.tl:0,"_":0,jn.tl:0," ":0;);
put /;
put "edge [color=grey]"/;
loop(an(in,jn,jn,kn),
  
put in.tl:0,"_":0,jn.tl:0,"->":0,jn.tl:0,"_",kn.tl:0/);
putclose"}"/;

*-----------------------------------------------------------
* Model v2
*-----------------------------------------------------------

M = 4;

singleton sets
   srcn(*,*)
/ src.n1 /
   snkn(*,*)
/n10.snk /
;

parameter inflown(*,*) 'new inflow';
inflown(srcn) = 1;
inflown(snkn) = -1;

binary variables fn 'new flow variables';
positive variable tn 'MTZ variable';

equations
    objn    
'new objective'
    flowbaln
'new flow balance on nodes'
    mtzn    
'new MTZ subtour elimination constraints'
;

objn.. count =e=
sum(an,fn(an));
flowbaln(nn)..
sum(an(nn2,nn),fn(an)) + inflown(nn) =e= sum(an(nn,nn2),fn(an));
mtzn(an(nn,nn2))$(
not srcn(nn) and not snkn(nn2))..
    tn(nn2) =g= tn(nn) + 1 +
card(nn)*(fn(nn,nn2)-1);
count.up = M+1;

model sp3 /objn,flowbaln,mtzn/;
sp3.optfile=1;
solve sp3 minimizing count using mip;

parameter fsoln(solIndex,*,*,*,*) 'solution from solution pool';

execute_load "solutions.gdx",sol=index,fsoln=fn;
option fsoln:0:2:2;
display "#### Model SP3/solution pool: augmented graph",
         count.up,sol,fsoln;


$onecho > cplex.opt
SolnPoolIntensity = 4
PopulateLim = 10000
SolnPoolPop = 2
solnpoolmerge solutions.gdx
$offecho

7 comments:

  1. A model for the same problem in MiniZinc that uses the pre-defined global constraint bounded_path (https://www.minizinc.org/doc-2.5.5/en/lib-globals.html#index-127): https://gist.github.com/zayenz/c4a779fa0e3a087c997b8962d7e0e5d5 By setting the solver configuration to find all solutions, it finds 19 paths of length at most 4. See statistics in the Gist.

    Note that this model will not use self-loops either, as it uses the definition of a path that a node can only be visited once.

    ReplyDelete
    Replies
    1. Are you using a the undirected version of the graph ? the last path shown in the Gist is not feasible in the directed version (which is why you have 19 paths and not 9 I suppose)

      Delete
  2. How about sequentially solving the shortest path model with 1/ binary variables 2/ a constraint to limit the number of arcs 3/ no good cuts to exclude previous solutions ?

    ReplyDelete
    Replies
    1. That should be essentially the same as using the solution pool.

      Delete
    2. I was thinking that perhaps with this approach the MTZ constraints would not be necessary, as the no good cuts would exclude the "first shortest path + an isolated tour".

      Delete
    3. At some stage the shortest path model may introduce subtours just to be different.

      Delete
    4. Not necessarily if we use the right no good cuts: https://or.stackexchange.com/questions/7966/how-to-compute-all-paths-between-two-given-nodes-in-a-network?noredirect=1#comment16869_7966

      Delete