## Sunday, October 18, 2020

### PuLP mystery

PuLP is a popular Python-based modeling tool for LP and MIP models. In , a user asked a (somewhat trivial) question about PuLP syntax. But there was an interesting wrinkle in the model that was unfamiliar to me. The basic issue is that sometimes PuLP accepts a NumPy array in expressions:

import pulp as p
import numpy as np

a=np.array([1,2,3])

x=p.LpVariable('x')
y=p.LpVariable('y')

prob = p.LpProblem("test",p.LpMaximize)
prob += x+(a-y)
prob


This fragment creates an objective. But we have a strange term here. a-y is funny because a is a NumPy array (vector) and y is a scalar PuLP variable. Usually, adding a scalar to a vector is interpreted as elementwise addition: $\begin{pmatrix} a_0 - y \\ a_1 - y \\ a_2-y \end{pmatrix}$ How one would interpret an objective like: $x + \begin{pmatrix} a_0 - y \\ a_1 - y \\ a_2-y \end{pmatrix}$ is not clear to me. I would say this is an error. However, PuLP accepts this, and prints the model as:

test:
MAXIMIZE
1*x + -3*y + 6
VARIABLES
x free Continuous
y free Continuous


So, apparently the interpretation is: $x + \sum_i (a_i-y)=x + \sum_i a_i - n\cdot y$ where $$n$$ is the length of vector $$a$$. But then again, we would expect PuLP to accept

  prob += (a-y)

as objective. However, this produces the error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-bf3a3a41e9da> in <module>()
8
9 prob = p.LpProblem("test",p.LpMaximize)
---> 10 prob += (a-y)
11 prob

1528             self.objective.name = name
1529         else:
-> 1530             raise TypeError("Can only add LpConstraintVar, LpConstraint, LpAffineExpression or True objects")
1531         return self
1532

TypeError: Can only add LpConstraintVar, LpConstraint, LpAffineExpression or True objects


I have no idea what is going on here. It could be all just a bug, but even that can not easily explain the behavior of sometimes accepting (a-y) and sometimes refusing it.

The underlying problem can actually be traced back to operator overloading as shown by . Who owns the + or -? The following experiment demonstrates the issue in stark terms:

>>> a+x
array([1*x + 1, 1*x + 2, 1*x + 3], dtype=object)
>>> x+a
1*x + 6


In the first case, NumPy handles the +, leading to the interpretation: $\begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix} + x = \begin{pmatrix} x+a_0 \\ x+a_1 \\x+a_2 \end{pmatrix}$ In the second case, the + is dealt with by PuLP. This gives $x+\begin{pmatrix} a_0 \\ a_1 \\ a_2 \end{pmatrix}=x+\sum_i a_i$

#### Conclusions

This is mind-bending (a.k.a. crazy). It would be better if things were a bit more predictable ("orthogonal" is the term used in programming language design). Preferably, we don't want to spend time on figuring out how to interpret a + or a -

Advise: it is better not to mix PuLP with NumPy.

#### References

1. 1. 