Formulating optimization models inside traditional programming languages such as Python is very popular. The main tool developers use to make this possible is operator overloading. There are cases where we can write code that looks somewhat reasonable and is accepted and processed without any warning or error message, but is total nonsense. It is rather difficult to make things airtight with this approach, especially when it comes to error handling. In [1], we see a good example. I have created a small fragment here that illustrates the problem.
import pulp

# data
n_orders = 2
m_rows = 2
cases = [1]*n_orders

# Define the model
model = pulp.LpProblem("Minimize_Total_Hours", pulp.LpMinimize)

# Decision variables: binary variables for assigning each order to a row
x = pulp.LpVariable.dicts("x", [(i, j) for i in range(n_orders) for j in range(m_rows)], 0, 1, pulp.LpBinary)

# Additional variable: number of orders assigned to each row
y = pulp.LpVariable.dicts("y", [j for j in range(m_rows)], 0, n_orders, pulp.LpInteger)

# CPH based on the number of orders in a row
def get_cph(num_orders):
    if num_orders == 1:
        return 152
    elif num_orders == 2:
        return 139
    elif num_orders == 3:
        return 128
    elif num_orders == 4:
        return 119
    elif num_orders == 5:
        return 112
    elif num_orders == 6:
        return 107
    else:
        return 104

# Objective function: Minimize total hours across all rows
model += pulp.lpSum(
    (pulp.lpSum(cases[i] * x[i, j] for i in range(n_orders)) / get_cph(y[j]))
    for j in range(m_rows)
)

print(model)

# Solve the model
model.solve()

# print solve status
pulp.LpStatus[model.status]
At first glance, this code looks reasonable. If you are a Python programmer but new to PuLP, this is code you could easily write. But actually, it is 100% total nonsense.
There are many, many issues here.
- PuLP is for linear problems only. So, division by a variable quantity is not allowed.
- We can't use if statements like this. LP and MIP models don't do if statements. They look at the world through a system of linear equations. These if statements need to be rewritten, usually with the help of binary variables. In this case, it looks a bit like a piecewise linear function.
- In PuLP, functions are not callbacks but are called at model generation time. I.e., the solver does not call your functions. The function get_cph is only called once for each j, before the solver starts. I.e., before y[j] has a value. For PuLP, y[j] only has a value after the solve. The use of functions has lots of benefits in programming tasks. However, in PuLP, they can give the illusion that they are called during the solve. Unfortunately, this is a rather common mistake.
- The == inside the function is not a standard comparison operator but an operator hijacked (overloaded) by PuLP. It does not compare values, and its result is always treated as true in this context (that is a Python thing). So, the first if-condition is always true, independent of the right-hand side. We could have used:
if num_orders == 3.14:
with the same results. You can also verify this by entering: print(bool(y[0]==3.14)).
- An expression like y[j]==1 is interpreted as a PuLP constraint. You can see this by entering: print(type(y[0]==1)).
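The last few points can be reproduced without PuLP at all. The sketch below uses two toy classes, ToyVariable and ToyConstraint, which are hypothetical stand-ins invented for this illustration and not PuLP's actual classes. It only assumes what is described above: == is overloaded to build a constraint object, and Python treats that object as true. How PuLP's own constraint class ends up truthy may differ in detail.

```python
class ToyConstraint:
    """Stand-in for a modeling-library constraint (hypothetical, not PuLP's class)."""
    def __init__(self, var, rhs):
        self.var = var
        self.rhs = rhs

    def __repr__(self):
        return f"{self.var.name} == {self.rhs}"


class ToyVariable:
    """Stand-in for a decision variable that overloads == to build constraints."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Operator overloading: build a constraint instead of comparing values.
        return ToyConstraint(self, other)

    __hash__ = object.__hash__  # keep instances hashable after overriding __eq__


calls = []  # records every invocation of get_cph


def get_cph(num_orders):
    calls.append(num_orders)
    if num_orders == 1:      # yields a ToyConstraint, which is truthy by default
        return 152
    elif num_orders == 2:
        return 139
    else:
        return 104


y = [ToyVariable(f"y_{j}") for j in range(2)]

# "Model generation": get_cph runs here, once per variable, before any solve.
coefficients = [get_cph(v) for v in y]

print(coefficients)              # every entry comes from the first branch
print(type(y[0] == 1).__name__)  # the comparison produced a constraint object
print(bool(y[0] == 3.14))        # default truthiness of a plain Python object
print(len(calls))                # how many times get_cph was called
```

The function is called exactly once per variable while the expression is being built, and the first branch is taken every time, whatever the right-hand side is.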
Nevertheless, the solve status reported for this model is:
'Optimal'
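For contrast, here is one way the intended logic might be written as a linear model with binary variables. This is a sketch under assumptions, not the formulation from [1]: it assumes y_j is meant to count the orders in row j (as the comment in the fragment suggests), and the symbols \delta_{j,k}, w_{j,k}, r_k and U are introduced here for illustration only. r_k is the constant that get_cph returns for k orders, n is n_orders, and U = \sum_i cases_i is an upper bound on the load of any row. Assignment constraints are omitted, as in the fragment.

```latex
\begin{align*}
\min\ & \sum_{j}\sum_{k=0}^{n} \frac{w_{j,k}}{r_k} \\
& \sum_{k=0}^{n} \delta_{j,k} = 1 && \forall j \\
& y_j = \sum_{k=0}^{n} k\,\delta_{j,k} && \forall j \\
& y_j = \sum_{i} x_{i,j} && \forall j \\
& \sum_{k=0}^{n} w_{j,k} = \sum_{i} \mathit{cases}_i\, x_{i,j} && \forall j \\
& 0 \le w_{j,k} \le U\,\delta_{j,k} && \forall j,k \\
& \delta_{j,k} \in \{0,1\} && \forall j,k
\end{align*}
```

Because exactly one \delta_{j,k} is 1 for each row, only the matching w_{j,k} can be nonzero, and it must equal the row's load. The division is then by the constant r_k, so the objective stays linear.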
How to confuse PuLP even more
If we replace
if num_orders == 1:
by
if num_orders == "crazy":
you will see things like:
RecursionError: maximum recursion depth exceeded in comparison
Obviously, this is crazy.
Fixes
- The confusion caused by a constraint evaluating to true can be solved by implementing the special __bool__ method and having it raise an exception with a proper message (maybe something like "this is a PuLP constraint and cannot be converted to a boolean"). By default, Python will consider any object to be true. I don't know what the idea is behind this. But, here, it creates a lot of confusion. At the very least, my suggestion would be to alert the user that something strange is happening with this code. That is much better than just saying nothing. Good code not only works flawlessly for perfect input, but also gives good feedback when the input does not make sense.
- The infinite recursion is just a PuLP bug. That should never happen. Luckily, it is easily debugged by inspecting the stack trace. The culprit is the function addInPlace of class LpAffineExpression. There is an obvious problem there when strings are involved. We can blame this on sloppy programming.
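The first fix can be sketched in a few lines. The classes StrictVariable and StrictConstraint below are hypothetical and invented for this illustration; this is not how PuLP is implemented, only a minimal example of a __bool__ method that refuses to convert a constraint to a boolean. The choice of TypeError and the wording of the message are assumptions as well.

```python
class StrictConstraint:
    """Hypothetical constraint class that refuses to be used as a boolean."""
    def __init__(self, lhs_name, rhs):
        self.lhs_name = lhs_name
        self.rhs = rhs

    def __bool__(self):
        # Called by Python whenever the object is used in if/elif/while/bool().
        raise TypeError(
            f"'{self.lhs_name} == {self.rhs}' is a constraint and cannot be "
            "converted to a boolean; it has no truth value before the solve"
        )


class StrictVariable:
    """Hypothetical variable class whose == builds a StrictConstraint."""
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return StrictConstraint(self.name, other)

    __hash__ = object.__hash__  # keep instances hashable after overriding __eq__


def get_cph(num_orders):
    if num_orders == 1:   # the if statement triggers StrictConstraint.__bool__
        return 152
    return 104


y0 = StrictVariable("y_0")
try:
    get_cph(y0)
    message = None        # only reached if no exception was raised
except TypeError as err:
    message = str(err)

print(message)
```

With this in place, the nonsensical if statement fails loudly at model generation time instead of silently picking the first branch. NumPy takes a similar stance for arrays with more than one element, where bool() raises an error about an ambiguous truth value. One thing to weigh in such a design is that any code relying on the truthiness of ==, such as membership tests on lists of variables, may start raising too.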
Conclusion
References
[1] Pulp Constraint Problem with [i,j] integers not giving the best Solution, https://or.stackexchange.com/questions/12615/pulp-constraint-problem-with-i-j-integers-not-giving-the-best-solution
[2] Modeling surprises, https://yetanothermathprogrammingconsultant.blogspot.com/2024/05/modeling-surprises.html
[3] PuLP Mystery, https://yetanothermathprogrammingconsultant.blogspot.com/2020/10/pulp-mystery.html