The simple shortest path problem is reproduced here:
The output is simply:
The output looks like:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 | from collections import defaultdict import pyomo.environ as pe import pyomo.opt as po # # Define the Graph # nodes = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} edges = {(0, 1), (0, 2), (0, 3), (1, 4), (1, 6), (2, 1), (2, 3), (2, 5), (3, 5), (4, 2), (5, 7), (5, 8), (6, 4), (6, 7), (6, 9), (7, 4), (7, 9), (8, 3), (8, 7), (8, 9)} distance = {(0, 1): 40, (0, 2): 8, (0, 3): 10, (1, 4): 6, (1, 6): 10, (2, 1): 4, (2, 3): 12, (2, 5): 2, (3, 5): 1, (4, 2): 2, (5, 7): 4, (5, 8): 3, (6, 4): 8, (6, 7): 20, (6, 9): 1, (7, 4): 0, (7, 9): 20, (8, 3): 6, (8, 7): 10, (8, 9): 2} # # Define Sets of Incoming and Outgoing Edges # Vm = defaultdict(set) Vp = defaultdict(set) for (i, j) in edges: Vm[i].add(j) Vp[j].add(i) # # Create the Model Object # model = pe.ConcreteModel() # # Define the Sets # model.nodes = pe.Set(initialize=nodes) model.edges = pe.Set(within=model.nodes*model.nodes, initialize=edges) # # Define the Parameters # model.Vm = pe.Param(model.nodes, initialize=Vm, default=set(), within=pe.Any) model.Vp = pe.Param(model.nodes, initialize=Vp, default=set(), within=pe.Any) model.s = 0 model.t = 9 model.distance = pe.Param(model.edges, initialize=distance) # # Define the Variables # model.x = pe.Var(model.edges, domain=pe.Reals, bounds=(0, 1)) # # Define the Objective # def shortest_path(model): return sum(model.distance[i, j] * model.x[i, j] for (i, j) in model.edges) model.obj_shortest_path = pe.Objective(sense=pe.minimize, rule=shortest_path) # # Define the Constraints # def flow_balance(model, i): flow_in = sum([model.x[j, i] for j in model.Vp[i]]) flow_out = sum([model.x[i, j] for j in model.Vm[i]]) if i == model.s: constraint = (flow_out == 1) elif i == model.t: constraint = (flow_in == 1) else: constraint = (flow_in == flow_out) return constraint model.con_flow_balance = pe.Constraint(model.nodes, rule=flow_balance) # # Solve and Postprocess # solver = po.SolverFactory('cbc', executable='/usr/bin/cbc') result = solver.solve(model, tee=True) i = int(model.s) path_nodes = [i] path_edges = [] stop = False while not stop: for j in model.Vm[i]: if model.x[i, j].value == 1: if j == int(model.t): stop = True path_nodes.append(j) path_edges.append((i, j)) i = j break display(path_nodes) |
[0, 2, 5, 8, 9]
Let's see how I would write the same problem in GAMS:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 | * * Define the Graph * set i 'nodes' /node0*node9/ step 'for reporting' /step1*step10/ ; alias(i,j); table distance(i,j) node0 node1 node2 node3 node4 node5 node6 node7 node8 node9 node0 40 8 10 node1 6 10 node2 4 12 2 node3 1 node4 2 node5 4 3 node6 20 1 node7 eps 20 node8 6 10 2 ; set a(i,j) 'arcs: links for which we have a distance'; a(i,j) = distance(i,j); display a; set s(i) 'source' /node0/ t(i) 'sink' /node9/ ; * * interpret source,sink as net inflow * parameter inflow(i) 'exogenous inflow'; inflow(s) = 1; inflow(t) = -1; * * model * binary variable x(i,j) 'flow'; variable totalLength; equations obj 'objective' flowBal(i) 'flow balance at node' ; obj.. totalLength =e= sum(a,x(a)*distance(a)); flowBal(i).. sum(a(j,i), x(j,i)) + inflow(i) =e= sum(a(i,j), x(i,j)); model sp /all/; * * solve * solve sp minimizing totalLength using rmip; option x:0; display x.l, totalLength.l; * * collect path * set visit(step,i) 'path'; singleton set n(i) 'current node'; n(i) = s(i); loop(step$card(n), visit(step,n) = yes; n(j) = x.l(n,j)>0.5; ); option visit:0:0:1; display visit; |
The output looks like:
---- 69 VARIABLE x.L flow node2 node5 node8 node9 node0 1 node2 1 node5 1 node8 1 ---- 69 VARIABLE totalLength.L = 15.000 ---- 82 SET visit path step1.node0 step2.node2 step3.node5 step4.node8 step5.node9
It is interesting to observe some of the differences:
- The Pyomo model is using numbers as indices. The GAMS model uses strings for this. In GAMS we can use strings representing numbers: '1', '2',... I almost never use set elements like this. I prefer to use meaningful names (say 'Wheat', 'Spain', etc), and if they are not available, prefix the number with what they are ('Node0', etc.). This is a simple measure but can help enormously in readability (especially when we have high-dimensional data).
- The GAMS model uses distances to populate the arcs set. There is one complication; zeros are not stored (in GAMS all data is stored sparse). So to make sure the link from node 7 to node 4 is not lost we use an EPS value (this value "exists" but is numerically zero).
- I handled the flow balance equation a bit differently. I like simple equations (see also [2]), so I encoded the special handling of the source and sink node into a sparse parameter inflow. With this, we are left with just a regular flow balance equation.
- The Pyomo model is an LP model. The GAMS model solves the model as an RMIP: a relaxed MIP model. The variables are declared as binary. A network model like this will produce solutions that are automatically integer valued. In this case that are 0-1 solutions.
- The Pyomo model uses more concepts and constructs. The GAMS language is very small (I like to say: if you understand the sum, you understand 90% of the language), but somehow we can express many complex concepts with it. Notable dictionaries are implicit in GAMS (it is set-oriented and all indices are basically a dict-like construct). But the same also means that sometimes you need to use somewhat unnatural code in order to stay inside this small language. Pyomo uses more language constructs and as there is always some Python function to help you out, this can lead to very compact code.
- In post-processing, the Pyomo code uses: if model.x[i, j].value == 1. That looks a bit dangerous to me. If the variable is 0.9999999 or 1.000001 it would not be able to construct the correct path. I typically use 0.5 as threshold, so I would write: if model.x[i, j].value > 0.5. That value is a bit ridiculous, but as I don't know a good choice for the tolerance to be used, why not be as conservative as we can (of course in a non-political sense).
- The GAMS line n(j) = x.l(n,j)>0.5; is bit special. This line reads as: "Give me the next node on the shortest path". If no such node exists, n will be empty and the loop terminates.
Conclusions
- Even small, simple models require careful attention.
- When translating a model from a different modeling tool, it is often not a good idea to translate directly line-by-line. The same thing holds when translating between spoken/written languages: translating the words does not lead to proper sentences in the target language. Instead, we should translate concepts.
References
- Optimization in Python: Intermediate Pyomo Workshop - Brent Austgen - UT Austin INFORMS, https://www.youtube.com/watch?v=T5LjmbyA1o0
- A scheduling problem: a modeling approach, http://yetanothermathprogrammingconsultant.blogspot.com/2021/07/a-scheduling-problem-modeling-approach.html
GAMS wins
ReplyDelete