Yet Another Math Programming Consultant: A network model: Pyomo vs GAMS

Network models are an important class of optimization models, in itself but also as part of larger models. In [1] a nice presentation is given on how to set up a network model in the Python-based modeling tool Pyomo.

The simple shortest path problem is reproduced here:

from collections import defaultdict

import pyomo.environ as pe
import pyomo.opt as po

#
# Define the Graph
#
nodes = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
edges = {(0, 1), (0, 2), (0, 3), (1, 4),
         (1, 6), (2, 1), (2, 3), (2, 5),
         (3, 5), (4, 2), (5, 7), (5, 8),
         (6, 4), (6, 7), (6, 9), (7, 4),
         (7, 9), (8, 3), (8, 7), (8, 9)}
distance = {(0, 1): 40, (0, 2):  8, (0, 3): 10, (1, 4):  6,
            (1, 6): 10, (2, 1):  4, (2, 3): 12, (2, 5):  2,
            (3, 5):  1, (4, 2):  2, (5, 7):  4, (5, 8):  3,
            (6, 4):  8, (6, 7): 20, (6, 9):  1, (7, 4):  0,
            (7, 9): 20, (8, 3):  6, (8, 7): 10, (8, 9):  2}

#
# Define Sets of Incoming and Outgoing Edges
#
Vm = defaultdict(set)
Vp = defaultdict(set)
for (i, j) in edges:
    Vm[i].add(j)
    Vp[j].add(i)

#
# Create the Model Object
#
model = pe.ConcreteModel()    

#
# Define the Sets
#
model.nodes = pe.Set(initialize=nodes)
model.edges = pe.Set(within=model.nodes*model.nodes, initialize=edges)

#
# Define the Parameters
#
model.Vm = pe.Param(model.nodes, initialize=Vm, default=set(), within=pe.Any)
model.Vp = pe.Param(model.nodes, initialize=Vp, default=set(), within=pe.Any)
model.s = 0
model.t = 9
model.distance = pe.Param(model.edges, initialize=distance)

#
# Define the Variables
#
model.x = pe.Var(model.edges, domain=pe.Reals, bounds=(0, 1))

#
# Define the Objective
#
def shortest_path(model):
    return sum(model.distance[i, j] * model.x[i, j]
               for (i, j) in model.edges)

model.obj_shortest_path = pe.Objective(sense=pe.minimize, rule=shortest_path)

# 
# Define the Constraints
#
def flow_balance(model, i):
    flow_in = sum([model.x[j, i] for j in model.Vp[i]])
    flow_out = sum([model.x[i, j] for j in model.Vm[i]])
    if i == model.s:
        constraint = (flow_out == 1)
    elif i == model.t:
        constraint = (flow_in == 1)
    else:
        constraint = (flow_in == flow_out)
    return constraint

model.con_flow_balance = pe.Constraint(model.nodes, rule=flow_balance)

#
# Solve and Postprocess
#

solver = po.SolverFactory('cbc', executable='/usr/bin/cbc')
result = solver.solve(model, tee=True)

i = int(model.s)
path_nodes = [i]
path_edges = []
stop = False
while not stop:
    for j in model.Vm[i]:
        if model.x[i, j].value == 1:
            if j == int(model.t):
                stop = True
            path_nodes.append(j)
            path_edges.append((i, j))
            i = j
            break

display(path_nodes)

The output is simply:

[0, 2, 5, 8, 9]

Let's see how I would write the same problem in GAMS:

*
* Define the Graph
*
set
   i 'nodes' /node0*node9/
   step 'for reporting' /step1*step10/
;
alias(i,j);

table distance(i,j)
       node0 node1 node2 node3 node4 node5 node6 node7 node8 node9
node0          40     8    10
node1                             6          10
node2           4          12           2
node3                                   1
node4                 2
node5                                               4     3
node6                                              20           1
node7                            eps                           20
node8                       6                      10           2
;

set a(i,j) 'arcs: links for which we have a distance';
a(i,j) = distance(i,j);
display a;

set
   s(i) 'source' /node0/
   t(i) 'sink'   /node9/
;

*
* interpret source,sink as net inflow
*
parameter inflow(i) 'exogenous inflow';
inflow(s) = 1;
inflow(t) = -1;

*
* model
*
binary variable x(i,j) 'flow';
variable totalLength;

equations
   obj          'objective'
   flowBal(i)   'flow balance at node'
;

obj.. totalLength =e= sum(a,x(a)*distance(a));
flowBal(i).. sum(a(j,i), x(j,i)) + inflow(i) =e= sum(a(i,j), x(i,j));

model sp /all/;

*
* solve
*
solve sp minimizing totalLength using rmip;
option x:0;
display x.l, totalLength.l;

*
* collect path
*
set visit(step,i) 'path';
singleton set n(i) 'current node';
n(i) = s(i);
loop(step$card(n),
   visit(step,n) = yes;
   n(j) = x.l(n,j)>0.5;
);
option visit:0:0:1;
display visit;

The output looks like:

----     69 VARIABLE x.L  flow

            node2       node5       node8       node9

node0           1
node2                       1
node5                                   1
node8                                               1


----     69 VARIABLE totalLength.L         =       15.000  

----     82 SET visit  path

step1.node0
step2.node2
step3.node5
step4.node8
step5.node9

It is interesting to observe some of the differences:

The Pyomo model is using numbers as indices. The GAMS model uses strings for this. In GAMS we can use strings representing numbers: '1', '2',... I almost never use set elements like this. I prefer to use meaningful names (say 'Wheat', 'Spain', etc), and if they are not available, prefix the number with what they are ('Node0', etc.). This is a simple measure but can help enormously in readability (especially when we have high-dimensional data).
The GAMS model uses distances to populate the arcs set. There is one complication; zeros are not stored (in GAMS all data is stored sparse). So to make sure the link from node 7 to node 4 is not lost we use an EPS value (this value "exists" but is numerically zero).
I handled the flow balance equation a bit differently. I like simple equations (see also [2]), so I encoded the special handling of the source and sink node into a sparse parameter inflow. With this, we are left with just a regular flow balance equation.
The Pyomo model is an LP model. The GAMS model solves the model as an RMIP: a relaxed MIP model. The variables are declared as binary. A network model like this will produce solutions that are automatically integer valued. In this case that are 0-1 solutions.
The Pyomo model uses more concepts and constructs. The GAMS language is very small (I like to say: if you understand the sum, you understand 90% of the language), but somehow we can express many complex concepts with it. Notable dictionaries are implicit in GAMS (it is set-oriented and all indices are basically a dict-like construct). But the same also means that sometimes you need to use somewhat unnatural code in order to stay inside this small language. Pyomo uses more language constructs and as there is always some Python function to help you out, this can lead to very compact code.
In post-processing, the Pyomo code uses: if model.x[i, j].value == 1. That looks a bit dangerous to me. If the variable is 0.9999999 or 1.000001 it would not be able to construct the correct path. I typically use 0.5 as threshold, so I would write: if model.x[i, j].value > 0.5. That value is a bit ridiculous, but as I don't know a good choice for the tolerance to be used, why not be as conservative as we can (of course in a non-political sense).
The GAMS line n(j) = x.l(n,j)>0.5; is bit special. This line reads as: "Give me the next node on the shortest path". If no such node exists, n will be empty and the loop terminates.

Conclusions

Even small, simple models require careful attention.
When translating a model from a different modeling tool, it is often not a good idea to translate directly line-by-line. The same thing holds when translating between spoken/written languages: translating the words does not lead to proper sentences in the target language. Instead, we should translate concepts.

References

Optimization in Python: Intermediate Pyomo Workshop - Brent Austgen - UT Austin INFORMS, https://www.youtube.com/watch?v=T5LjmbyA1o0
A scheduling problem: a modeling approach, http://yetanothermathprogrammingconsultant.blogspot.com/2021/07/a-scheduling-problem-modeling-approach.html

Yet Another Math Programming Consultant

Sunday, August 8, 2021

A network model: Pyomo vs GAMS

Conclusions

References

1 comment: