
Saturday, October 26, 2024

PuLP surprises

Formulating optimization models inside traditional programming languages such as Python is very popular. The main tool developers use to make this possible is operator overloading. There are cases where we can write code that looks somewhat reasonable and is accepted and processed without any warnings or errors, but is total nonsense. With this approach it is rather difficult to make things airtight, especially the error handling. In [1], we see a good example. I have created a small fragment here that illustrates the problem.

Thursday, May 9, 2024

Modeling surprises

Here is an example where the PuLP modeling tool goes berserk.

In standard linear programming, only \(\ge\), \(=\) and \(\le\) constraints are supported. Some tools also allow \(\ne\), which for MIP models needs to be reformulated into a disjunctive constraint. Here is an attempt to do this in PuLP [1]. PuLP does not support this relational operator in its constraints, so we would expect a meaningful error message.
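As an aside, the usual reformulation of \(x \ne y\) for integer variables is a big-M disjunction with a binary variable \(\delta\), where \(M\) is an upper bound on \(|x-y|\): \[\begin{align} & x \le y - 1 + M\cdot\delta \\ & x \ge y + 1 - M\cdot(1-\delta) \\ & \delta \in \{0,1\}\end{align}\] With \(\delta=0\) we enforce \(x \lt y\), and with \(\delta=1\) we enforce \(x \gt y\), so exactly one branch of the disjunction is active.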

Thursday, February 8, 2024

Small non-convex MINLP: Pyomo vs GAMS

In [1], the following Pyomo model (Python fragment) is presented. Note that the constraint rules below call value(model.x) at construction time, so the summation range is frozen at the initial value of x (here 10) instead of depending on the decision variable:


model.x = Var(name="Number of batches", domain=NonNegativeIntegers, initialize=10)                    
model.a = Var(name="Batch Size", domain=NonNegativeIntegers, bounds=(5,20))

# Objective function
def total_production(model):
    return model.x * model.a
model.total_production = Objective(rule=total_production, sense=minimize)

# Constraints
# Minimum production of the two output products
def first_material_constraint_rule(model):
    return sum(0.2 * model.a * i for i in range(1, value(model.x)+1)) >= 70
model.first_material_constraint = Constraint(rule=first_material_constraint_rule)

def second_material_constraint_rule(model):
    return sum(0.8 * model.a * i for i in range(1, value(model.x)+1)) >= 90
model.second_material_constraint = Constraint(rule=second_material_constraint_rule)

# At least one production run
def min_production_rule(model):
    return model.x >= 1
model.min_production = Constraint(rule=min_production_rule)

Wednesday, September 20, 2023

Julia vs Python

I gave a talk to economists (i.e., not professional programmers) about a Julia project we were working on. Julia is famous for its speed. It uses LLVM [1] as the back-end for its JIT (Just-In-Time) compilation. As seeing is believing, here is an example algorithm that is often used to compare performance between different languages and tools. This example was chosen because it is small, easy to explain, and easy to program, while still showing meaningful timing differences.

We have a square \([-1,+1]\times[-1,+1]\) and an inscribed circle with radius \(1\). See the figure below. Their areas are \(4\) and \(\pi\) respectively. The idea is to draw \(n\) points \[\begin{align}& x_i \sim U(-1,+1) \\ & y_i \sim U(-1,+1)\end{align}\]Let \(m\) be the number of points inside the circle, i.e. with \[x_i^2+y_i^2\lt 1\] From the ratio of the areas, we have \[\frac{m}{n} \approx \frac{\pi}{4}\] It follows that an estimate of \(\pi\) is \[\hat{\pi}=4\frac{m}{n}\]
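A minimal Python version of this estimator (just a sketch of the algorithm, not the benchmark code from the talk):

import random

def estimate_pi(n: int) -> float:
    # count points drawn uniformly from the square that land inside the circle
    m = sum(
        1 for _ in range(n)
        if random.uniform(-1, 1) ** 2 + random.uniform(-1, 1) ** 2 < 1
    )
    return 4 * m / n

print(estimate_pi(1000))   # close to 3.14, with noise for small n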


Simulation with n=1000

Wednesday, April 5, 2023

In-process, in-memory databases

There are a few database systems that are a bit different. They are libraries that can be linked directly to your application. Linking can be done statically (during the compilation/linking step) or dynamically (using a shared library or DLL). Here I want to show two cases:

  • SQLite [1] used from R on data frames
  • DuckDB [2] used from Python, again on data frames
So these databases not only run inside R or Python, but can also operate directly on data frames.
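For example, DuckDB can query a pandas data frame by name, picking it up from the surrounding Python scope (a minimal sketch):

import duckdb
import pandas as pd

df = pd.DataFrame({"i": [1, 2, 3], "x": [10.0, 20.0, 30.0]})

# DuckDB finds the data frame df in the local scope and scans it directly
result = duckdb.query("SELECT i, x FROM df WHERE x > 15").to_df()
print(result)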

Monday, May 16, 2022

Ranking using numpy.argsort

I needed to find a ranking of a large data set. Using Python, it makes sense to look at the numpy library for this. 

Numpy has the function argsort, which returns index positions [1]. One would think these are exactly the ranks we are after. Unfortunately, this is not the case.

>>> import numpy as np
>>> a = [3.0, 1.0, 5.0, 2.0]
>>> indx = np.argsort(a)
>>> indx
array([1, 3, 0, 2], dtype=int64)
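
A common fix is to apply argsort a second time: this inverts the permutation and yields the ranks (a sketch of the standard trick, not necessarily the solution the full post settles on):

>>> ranks = np.argsort(indx)          # same as np.argsort(np.argsort(a))
>>> ranks
array([2, 0, 3, 1], dtype=int64)

Here 3.0 is the third-smallest value and gets rank 2 (0-based), 1.0 gets rank 0, and so on.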


Tuesday, August 31, 2021

matrix operations via GAMS Embedded Python: multi-regional Input-Output tables

Wassily Leontief in front of an IO table (photo credit: NYU)


Inverting a dense matrix

GAMS allows running pieces of Python code as part of a GAMS model. Unfortunately, the interface is rather low-level and inefficient. Usually "low-level" is associated with high performance, but in the Python world this is not the case. Here is an example.

Sunday, August 8, 2021

A network model: Pyomo vs GAMS

Network models are an important class of optimization models, both in their own right and as parts of larger models. In [1] a nice presentation is given on how to set up a network model in the Python-based modeling tool Pyomo.

The simple shortest path problem is reproduced here:

Wednesday, January 6, 2021

Eight soldiers lining up: a very large permutation problem

In [1] an intriguing problem is posted:


There are 8 soldiers, gathering and lining up every morning for their military service. The commander at the head of these soldiers demands that the morning lineup of these soldiers be arranged differently for every next day according to the following rule:

        Any three soldiers cannot be lined up next to each other in the same order on other days.

For example, if ABCDEFGH is the first arrangement on day 1, then on later days ABC, BCD, CDE, DEF, EFG and FGH cannot be lined up next to each other in that order anymore; the arrangement ACB, however, is okay (until used once on some other day), since those soldiers are not in the same order.

What is the maximum number of days this can happen?


Let's first make the problem a bit smaller. Say we have just 4 soldiers. This gives us  \(4!=24\) permutations or line-ups. Let's have a look at the following sets: 

  • \(p\) is the set of permutations
  • \(t\) is the set of substrings of length 3 that we can form from \(p\)
  • \(\color{darkblue}pt(p,t)\) is the mapping between \(p\) and \(t\)

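The sets and the mapping can be generated with a few lines of Python (a sketch along the lines of what the post computes; the GAMS listing below shows the result for 4 soldiers):

from itertools import permutations

soldiers = "ABCD"
p = ["".join(q) for q in permutations(soldiers)]
# all contiguous substrings of length 3 in each permutation
pt = [(q, q[i:i + 3]) for q in p for i in range(len(q) - 2)]
t = sorted({s for (_, s) in pt})
print(len(p), len(t), len(pt))   # 24 24 48
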
----     53 SET p  permutations or line ups

ABCD,    ABDC,    ACBD,    ACDB,    ADBC,    ADCB,    BACD,    BADC,    BCAD,    BCDA,    BDAC,    BDCA,    CABD
CADB,    CBAD,    CBDA,    CDAB,    CDBA,    DABC,    DACB,    DBAC,    DBCA,    DCAB,    DCBA


----     53 SET t  substrings of length 3

ABC,    BCD,    ABD,    BDC,    ACB,    CBD,    ACD,    CDB,    ADB,    DBC,    ADC,    DCB,    BAC,    BAD,    BCA
CAD,    CDA,    BDA,    DAC,    DCA,    CAB,    CBA,    DAB,    DBA


----     55 SET pt  mapping (computed in Python)

ABCD.ABC,    ABCD.BCD,    ABDC.ABD,    ABDC.BDC,    ACBD.ACB,    ACBD.CBD,    ACDB.ACD
ACDB.CDB,    ADBC.ADB,    ADBC.DBC,    ADCB.ADC,    ADCB.DCB,    BACD.ACD,    BACD.BAC
BADC.ADC,    BADC.BAD,    BCAD.BCA,    BCAD.CAD,    BCDA.BCD,    BCDA.CDA,    BDAC.BDA
BDAC.DAC,    BDCA.BDC,    BDCA.DCA,    CABD.ABD,    CABD.CAB,    CADB.ADB,    CADB.CAD
CBAD.BAD,    CBAD.CBA,    CBDA.CBD,    CBDA.BDA,    CDAB.CDA,    CDAB.DAB,    CDBA.CDB
CDBA.DBA,    DABC.ABC,    DABC.DAB,    DACB.ACB,    DACB.DAC,    DBAC.BAC,    DBAC.DBA
DBCA.DBC,    DBCA.BCA,    DCAB.DCA,    DCAB.CAB,    DCBA.DCB,    DCBA.CBA


----     66 PARAMETER np                   =           24  card(p)
            PARAMETER nt                   =           24  card(t)
            PARAMETER npt                  =           48  card(pt)


In [1], Rob Pratt proposes a MIP model for this:

Model A
\[\begin{align}\max\> & \color{darkred}z = \sum_p \color{darkred}x_p \\ &\sum_{p|\color{darkblue}{pt}(p,t)} \color{darkred}x_p \le 1&& \forall t \\ & \color{darkred}x_p \in \{0,1\}\end{align} \]
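
Model A is a set-packing problem: select as many permutations as possible such that no length-3 substring is used more than once. A quick PuLP rendering, reusing p, t, and pt from the Python fragment above (a sketch, using the default CBC solver):

import pulp

model = pulp.LpProblem("lineups", pulp.LpMaximize)
x = pulp.LpVariable.dicts("x", p, cat="Binary")
model += pulp.lpSum(x[q] for q in p)
for s in t:
    # each substring can appear in at most one selected permutation
    model += pulp.lpSum(x[q] for (q, ss) in pt if ss == s) <= 1
model.solve()
print(pulp.value(model.objective))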

Sunday, October 18, 2020

PuLP mystery



PuLP is a popular Python-based modeling tool for LP and MIP models. In [1], a user asked a (somewhat trivial) question about PuLP syntax. But there was an interesting wrinkle in the model that was unfamiliar to me. The basic issue is that sometimes PuLP accepts a NumPy array in expressions:
 
import pulp as p
import numpy as np

a=np.array([1,2,3])

x=p.LpVariable('x')
y=p.LpVariable('y')

prob = p.LpProblem("test",p.LpMaximize)
prob += x+(a-y)
prob

This fragment creates an objective. But we have a strange term here: a-y is funny because a is a NumPy array (vector) and y is a scalar PuLP variable. Usually, adding a scalar to a vector is interpreted as elementwise addition: \[\begin{pmatrix} a_0 - y \\ a_1 - y \\ a_2-y \end{pmatrix}\] How one would interpret an objective like \[ x + \begin{pmatrix} a_0 - y \\ a_1 - y \\ a_2-y \end{pmatrix}\] is not clear to me. I would say this is an error. However, PuLP accepts it and prints the model as:


test:
MAXIMIZE
1*x + -3*y + 6
VARIABLES
x free Continuous
y free Continuous

So, apparently the interpretation is: \[ x + \sum_i (a_i-y)=x + \sum_i a_i - n\cdot y\] where \(n\) is the length of vector \(a\). But then again, we would expect PuLP to accept
 
  prob += (a-y)

as objective. However, this produces the error message: 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-bf3a3a41e9da> in <module>()
      8 
      9 prob = p.LpProblem("test",p.LpMaximize)
---> 10 prob += (a-y)
     11 prob

/usr/local/lib/python3.6/dist-packages/pulp/pulp.py in __iadd__(self, other)
   1528             self.objective.name = name
   1529         else:
-> 1530             raise TypeError("Can only add LpConstraintVar, LpConstraint, LpAffineExpression or True objects")
   1531         return self
   1532 

TypeError: Can only add LpConstraintVar, LpConstraint, LpAffineExpression or True objects


I have no idea what is going on here. It could all be just a bug, but even that cannot easily explain the behavior of sometimes accepting (a-y) and sometimes refusing it.

The underlying problem can actually be traced back to operator overloading, as shown by [2]. Who owns the + or -? The following experiment demonstrates the issue in stark terms:


>>> a+x
array([1*x + 1, 1*x + 2, 1*x + 3], dtype=object)
>>> x+a
1*x + 6

In the first case, NumPy handles the +, leading to the interpretation: \[\begin{pmatrix} a_0  \\ a_1  \\ a_2 \end{pmatrix} + x = \begin{pmatrix} x+a_0  \\ x+a_1  \\x+a_2  \end{pmatrix}\] In the second case, the + is dealt with by PuLP. This gives \[x+\begin{pmatrix} a_0  \\ a_1  \\ a_2 \end{pmatrix}=x+\sum_i a_i\]  

Conclusions


This is mind-bending (a.k.a. crazy). It would be better if things were a bit more predictable ("orthogonal" is the term used in programming language design). Preferably, we don't want to spend time figuring out how to interpret a + or a -.

Advice: it is better not to mix PuLP with NumPy.
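
If NumPy data must be combined with PuLP expressions anyway, one defensive pattern is to convert to plain Python numbers first, so that PuLP owns all the operators (a sketch, continuing the fragment above):

coeffs = a.tolist()                           # plain Python ints: [1, 2, 3]
prob += x + p.lpSum(c - y for c in coeffs)    # now PuLP handles every + and -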



Thursday, April 23, 2020

Random Sparse Arcs in GAMS

I was playing a bit with generating a large network (graph) with \(n=5000\) nodes and say \[\frac{n^2}{100}= 250,000\] arcs. The standard way to generate this in GAMS is:

*
* nodes
*
set i 'nodes' /node1*node5000/;
alias(i,j);

*
* random arcs (1%)
*
set A(i,j) 'arcs';
A(i,j) = uniform(0,1) < 0.01;

scalar numArcs 'number of arcs';
numArcs = card(A);
display numArcs;


Basically, this approach generates \(n^2\) random numbers \(r_{i,j} \sim U(0,1)\) (uniform distribution) and picks the arcs with \(r_{i,j}\lt 0.01\). This is a randomized process, so we don't get exactly 250,000 arcs, but rather:


----     16 PARAMETER numArcs              =   249170.000  number of arcs


This operation is not exactly cheap: it takes 11.2 seconds on my laptop.
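
For comparison, the same thresholding idea is essentially a one-liner in NumPy (a sketch only; it illustrates the approach, not a drop-in GAMS replacement):

import numpy as np

rng = np.random.default_rng(999)
n = 5000
# threshold an n-by-n matrix of U(0,1) draws at 1%
A = rng.random((n, n)) < 0.01
print(A.sum())   # roughly 250,000 arcs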

There are a few other approaches we can try.

Random locations


Instead of generating \(n^2\) random numbers, we can generate \(k=250,000\) pairs of \((i,j)\) values (integers between 1 and \(n\)). A straightforward implementation would be:

scalars k,n,nn,ni,nj;
n = card(i);
nn = n*n/100;
for (k = 1 to nn,
   ni = uniformint(1,n);
   nj = uniformint(1,n);
   A(i,j)$(ord(i)=ni and ord(j)=nj) = yes;
);

Unfortunately, this is extremely slow. I stopped it after 2,000 seconds: it was still not finished. GAMS is horribly slow when doing loops like this.

Crazy indexing


A variant of the above approach is to try to trick GAMS into a vectorized version of this. Well, this is not so easy. Here is a version that uses leads/lags when indexing:

sets
   A(*,*) 'arcs'
   k      /node1*node5000,k5001*k250000/
   ij     /i,j/
;
parameter r(k,ij) 'random offset from k';
r(k,ij) = uniformint(1,card(i)) - ord(k);
A(k+r(k,'i'),k+r(k,'j')) = yes;


This requires some explanation. 
  • The set A is no longer domain-checked. We use funky indexing here, so we need to loosen things up.
  • The set k has 250,000 elements. The first 5000 are identical to \(i,j\).
  • The parameter r contains \(2\times 250,000\) random integers between 1 and \(n\). We store them as offsets from the index \(k\). This sounds crazy but the next statement explains why this is done.
  • The assignment to A is using leads. Think of it as a variant of A(k,k) = yes.  This would populate a diagonal. Similarly, A(k+1,k) = yes shifts the diagonal by one. Using A(k+r(k,'i'),k+r(k,'j')) we index exactly the correct location.
  • We may generate duplicates, so the number will be slightly below 250k arcs.
This beast actually works, but the performance is not great: the fragment takes about 13 seconds, so there is no gain compared to our original approach.

Python loop


Let's try some Python. We can build up sets using Python within a GAMS model:

set A(i,j) 'arcs';

$onEmbeddedCode Python:
from random import seed,randint
seed(999)
n = len(list(gams.get("i")))
nn = n*n//100
a = {}
for k in range(nn):
   ni = randint(1,n)
   nj = randint(1,n)
   elem = ("node{}".format(ni),"node{}".format(nj))
   a[elem] = 1
gams.set("A",list(a))
$offEmbeddedCode A



This is just a k-loop. Here we use a Python dictionary to store the elements, as this handles possible duplicates correctly. This approach also takes about 11 seconds, so again no gain compared to our first approach.

R to the rescue


So let's see if we can use R to speed things up:

$set n    5000
$set inc  data.inc

*
* R script
*
$onecho > script.R
library(data.table)
n <- %n%
nn <- n^2/100
df <- data.frame(ni=sample(n,nn,replace=TRUE),
                 nj=sample(n,nn,replace=TRUE))
df <- df[order(df$ni,df$nj),]
v <- unique(paste0("node",df$ni,".node",df$nj))
fwrite(list(v),"%inc%",col.names=F,quote=F)
$offecho
$call '"c:\program files\R\R-3.5.0\bin\Rscript.exe" script.R'

*
* nodes and arcs
*
set i 'nodes' /node1*node%n%/;
alias(i,j);

set A(i,j) 'arcs' /
$offlisting
$include %inc%
$onlisting
/;


From inside GAMS, we write an R script that is executed by Rscript.exe. The steps are:
  • sample the arcs,
  • sort in the order that GAMS likes,
  • create a string representation of the tuple, e.g. "node1.node2",
  • remove duplicates (this means we end up with a number of arcs that is smaller than 250k),
  • write to an include file (note: the function fwrite from the data.table package seems a bit faster than the function writeLines from the base package)
The include file is then processed by GAMS. To speed things up, we suppress echoing to the listing file.

This method is the winner: it takes just 3.5 seconds. Sometimes using a plain old text file as a communication channel is not so bad.


GDX file is slower


The code:

$set n    5000
$set gdx  data.gdx

*
* R script
*
$onecho > script.R
library(gdxrrw)
library(dplyr)
n <- 5000
nn <- n^2/100
df <- data.frame(ni=paste0("node",sample(n,nn,replace=TRUE)),
                 nj=paste0("node",sample(n,nn,replace=TRUE)),
                 stringsAsFactors=F)
df <- distinct(df,ni,nj)
nr <- nrow(df)
wgdx("%gdx%",list(name="A", type="set", dim=2, form="sparse", ts="arcs", val=cbind(1:nr,1:nr), uels=c(list(df$ni),list(df$nj))))
$offecho
$call '"c:\program files\R\R-3.5.0\bin\Rscript.exe" script.R'

*
* nodes and arcs
*
set i 'nodes' /node1*node%n%/;
alias(i,j);

set A(i,j) 'arcs';
$gdxin %gdx%
$loaddc A


is slower than using a text file. The reason is that wgdx is not as efficient.

Saturday, January 4, 2020

Small Blending Problem in PuLP




In [1] a small blending problem is proposed.

There are different raw steel materials that have to be mixed (blended) into a final material that has certain specifications. In this case the specifications are limits on the elements Carbon (C), Copper (Cu) and Manganese (Mn). We assume things blend linearly.

Blending is a traditional linear programming application, and models are found in many text books.

The problem is small so let's try PuLP here.

I'll try to write a bit about indexing using strings (instead of integers), and compare PuLP with CVXPY and GAMS. As the model is small, I'll also add some data manipulation (using data frames) and some simple reporting (also using data frames). I use string indexing in PuLP (with GAMS this is standard). Of course, CVXPY is very different: it is matrix based, so things are position dependent. The idea is to extract data from the data frame in such a way that positions are predictable.

Problem data


The data for the problem is as follows:


Demand: 5000 Kg

Specification of final material:

    Element      Min %   Max %
    Carbon       2       3
    Copper       0.4     0.6
    Manganese    1.2     1.65


Raw material inventory:

Alloy           C %     Cu %    Mn %    Stock (kg)   Price (€/kg)
Iron alloy      2.50    0.00    1.30    4000         1.20
Iron alloy      3.00    0.00    0.80    3000         1.50
Iron alloy      0.00    0.30    0.00    6000         0.90
Copper alloy    0.00    90.00   0.00    5000         1.30
Copper alloy    0.00    96.00   4.00    2000         1.45
Aluminum alloy  0.00    0.40    1.20    3000         1.20
Aluminum alloy  0.00    0.60    0.00    2500         1.00


Mathematical Model


The basic model is:

Blending Model
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i\cdot \color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Min}}_j \le \frac{\displaystyle\sum_i \color{darkblue}{\mathit{Element}}_{i,j}\cdot \color{darkred}{\mathit{Use}}_i}{\displaystyle \sum_i \color{darkred}{\mathit{Use}}_i} \le \color{darkblue}{\mathit{Max}}_j \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & 0 \le \color{darkred}{\mathit{Use}}_i \le \color{darkblue}{\mathit{Available}}_i \end{align} \]

Here we use: \[\begin{cases} i  & \text{Types of raw material in stock}\\ j & \text{Element with limits}\\ {\mathit{Cost}}_i & \text{Unit cost of raw material} \\ {\mathit{Min}}_j, {\mathit{Max}}_j & \text{Limits on element content in final product} \\   {\mathit{Element}}_{i,j} & \text{Content of elements in raw material}\\ {\mathit{Demand}} & \text{Demand for final product}\\ {\mathit{Available}}_i  & \text{Availability of raw material} \\  {\mathit{Use}}_i  & \text{Decision variable: how much raw material to use} \end{cases}\] The blending constraint is nonlinear: we divide by the total weight of the final product to calculate the percentages. We can linearize this fraction in two ways:

  1. multiply all sides by \(\sum_i \mathit{Use}_i\). This leads to \[\mathit{Min}_j \cdot\sum_i \mathit{Use}_i \le \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j \sum_i \mathit{Use}_i\] This first reformulation is especially useful when the total final product is not constant. Note that this formulation is sometimes difficult to recognize when rewritten as something like: \[\begin{cases} \displaystyle \sum_i (\mathit{Element}_{i,j}-\mathit{Min}_j )\mathit{Use}_i  \ge 0 \>\>\forall j \\ \displaystyle \sum_i (\mathit{Max}_j - \mathit{Element}_{i,j})\mathit{Use}_i  \ge 0 \>\>\forall j \end{cases}\] 
  2. In our case we know that \(\sum_i \mathit{Use}_i\) is constant: it is always equal to \(\mathit{Demand}\). I.e. \[ \mathit{Min}_j  \le  \frac{1}{\mathit{Demand}} \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j\]


We often need to split sandwich equations into two simple inequalities. That often leads to duplicate expressions: \[\begin{cases}  \displaystyle  \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \ge \mathit{Min}_j \>\>\forall j \\  \displaystyle \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j} \cdot\mathit{Use}_i \le \mathit{Max}_j  \>\>\forall j \end{cases}\] For small problems this is not an issue. For larger problems, I prefer to introduce extra variables that prevent these duplicate expressions. We end up with the following linear programming formulation:

Linear Programming Formulation
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot\color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Demand}} \cdot \color{darkred}{\mathit{Content}}_j = \sum_i \color{darkblue}{\mathit{Element}}_{i,j} \cdot\color{darkred}{\mathit{Use}}_i \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & \color{darkred}{\mathit{Use}}_i \in [0, \color{darkblue}{\mathit{Available}}_i]\\ & \color{darkred}{\mathit{Content}}_j \in [\color{darkblue}{\mathit{Min}}_j,\color{darkblue}{\mathit{Max}}_j] \end{align} \]


Even for small, almost trivial models, it makes sense to first develop a mathematical model, especially if you are not very experienced in developing linear programming models. Starting with pen and paper is sometimes better than immediately starting to code.

Implementation in Python/Pulp


An implementation using PuLP can look like:


from io import StringIO
import pandas as pd
import pulp as lp

# for inputting tabular data below
def table(s):
  return pd.read_csv(StringIO(s),sep=r'\s+',index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
   ID  Element      Min   Max
   C   Carbon       2     3
   Cu  Copper       0.4   0.6
   Mn  Manganese    1.2   1.65
    """)

supplyData = table("""
  ID  Alloy             C       Cu     Mn     Stock   Price
  A   "Iron alloy"      2.50    0.00   1.30   4000    1.20
  B   "Iron alloy"      3.00    0.00   0.80   3000    1.50
  C   "Iron alloy"      0.00    0.30   0.00   6000    0.90
  D   "Copper alloy"    0.00   90.00   0.00   5000    1.30
  E   "Copper alloy"    0.00   96.00   4.00   2000    1.45
  F   "Aluminum alloy"  0.00    0.40   1.20   3000    1.20
  G   "Aluminum alloy"  0.00    0.60   0.00   2500    1.00
  """)

print("----- Data-------")
print(requirements)
print(supplyData)


#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...] 
Items = supplyData.index
Elements = requirements.index

print("----- Indices-------")
print(Items)
print(Elements)

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------


use = lp.LpVariable.dicts("Use",Items,0,None,cat='Continuous')
content = lp.LpVariable.dicts("Content",Elements,0,None,cat='Continuous')

model = lp.LpProblem("Steel", lp.LpMinimize)

# objective : minimize cost
model += lp.lpSum([use[i]*supplyData.loc[i,'Price'] for i in Items ])

# upper bounds wrt availability
for i in Items:
  model += use[i] <= supplyData.loc[i,'Stock']

# final content of elements and their bounds  
for j in Elements:
  model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
  model += content[j] >= requirements.loc[j,'Min']
  model += content[j] <= requirements.loc[j,'Max']

# meet demand
model += lp.lpSum([use[i] for i in Items]) == demand


# for debugging
#print(model)


#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve()


print("----- Model Results-------")
print("Status:", lp.LpStatus[model.status])
print("Objective:",lp.value(model.objective))


# collect results
L = []
for i in Items: 
  L.append(['use',i,0.0,use[i].varValue,supplyData.loc[i,'Stock']])
for j in Elements:
  L.append(['content',j,requirements.loc[j,'Min'],content[j].varValue,requirements.loc[j,'Max']])
results = pd.DataFrame(L,columns=['Variable','Index','Lower','Value','Upper'])
print(results)


Notes:

  • We input the basic data as data frames. Data frames are a standard way to handle tabular data. Data frames are originally from the R statistical software system.
  • Usually read_csv is for CSV files. Here we use it to read from a string. Blanks are used as separator to make the table more readable for humans.
  • For each data frame we added an index column. This index allows us to select a row from the data frame. Note that the index is a string. In general, using strings as index is safer than using an index number: we notice much earlier that something is wrong when we make a mistake like using \(j\) (element) instead of \(i\) (raw material).
  • Python Pandas allows duplicate indices. We can check for this using the duplicated() function.
  • Because we access the data by name, it would not matter if the rows or columns are in a different position. This is more like a database table, where we assume no particular ordering. 
  • We also use a data frame for reporting. Data frames are printed in a nicer way than Python arrays, and they can be exported to CSV files or spreadsheets with one function call.
  • The variables are also indexed by names. This is accomplished by lp.LpVariable.dicts(). This is safer than using a standard array of variables.
  • AFAIK, PuLP can only handle a single scalar bound in the LpVariable statement (e.g. all lower bounds for the variables are zero). This means we have to specify a number of bounds as explicit singleton constraints, or use a var.bounds statement, as sketched below.
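
For completeness, the bounds route could look something like this (a sketch, assuming the use dictionary and supplyData frame from the model above; LpVariable's bounds method sets the lower and upper bound in place):

# set per-variable upper bounds instead of the singleton constraints above
for i in Items:
  use[i].bounds(0, supplyData.loc[i,'Stock'])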

The results look like:


----- Data-------
      Element  Min   Max
ID                      
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID                                              
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00
----- Indices-------
Index(['A', 'B', 'C', 'D', 'E', 'F', 'G'], dtype='object', name='ID')
Index(['C', 'Cu', 'Mn'], dtype='object', name='ID')
----- Model Results-------
Status: Optimal
Objective: 5887.57427835
  Variable Index  Lower        Value    Upper
0      use     A    0.0  4000.000000  4000.00
1      use     B    0.0     0.000000  3000.00
2      use     C    0.0   397.763020  6000.00
3      use     D    0.0     0.000000  5000.00
4      use     E    0.0    27.612723  2000.00
5      use     F    0.0   574.624260  3000.00
6      use     G    0.0     0.000000  2500.00
7  content     C    2.0     2.000000     3.00
8  content    Cu    0.4     0.600000     0.60
9  content    Mn    1.2     1.200000     1.65


Safety


We will get an error if we misspell things. E.g. if we use in the second table Mgn instead of Mn, we will see:


KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
9 frames
<ipython-input-3-d87ed8f34980> in <module>()
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61   model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62   model += content[j] >= requirements.loc[j,'Min']
     63   model += content[j] <= requirements.loc[j,'Max']

<ipython-input-3-d87ed8f34980> in <listcomp>(.0)
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61   model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62   model += content[j] >= requirements.loc[j,'Min']
     63   model += content[j] <= requirements.loc[j,'Max']

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1416                 except (KeyError, IndexError, AttributeError):
   1417                     pass
-> 1418             return self._getitem_tuple(key)
   1419         else:
   1420             # we by definition only have the 0th axis

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
    803     def _getitem_tuple(self, tup):
    804         try:
--> 805             return self._getitem_lowerdim(tup)
    806         except IndexingError:
    807             pass

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    959                     return section
    960                 # This is an elided recursive call to iloc/loc/etc'
--> 961                 return getattr(section, self.name)[new_key]
    962 
    963         raise IndexingError("not applicable")

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1422 
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425 
   1426     def _is_scalar_access(self, key: Tuple):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1848         # fall thru to straight lookup
   1849         self._validate_key(key, axis)
-> 1850         return self._get_label(key, axis=axis)
   1851 
   1852 

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _get_label(self, label, axis)
    154             # but will fail when the index is not present
    155             # see GH5667
--> 156             return self.obj._xs(label, axis=axis)
    157         elif isinstance(label, tuple) and isinstance(label[axis], slice):
    158             raise IndexingError("no slices here, handle elsewhere")

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3735             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3736         else:
-> 3737             loc = self.index.get_loc(key)
   3738 
   3739             if isinstance(loc, np.ndarray):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'


PuLP is not giving a very well-formed error message here (not sure if PuLP can actually do that -- Python raises this exception before PuLP sees what is happening). But at least we are alerted (rather heavy-handedly) that there is something wrong when we use Mn. Careful inspection of the stack trace shows we have a problem in the constraint

 model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])

This error is actually generating an exception inside an exception handler!

Of course, a much better and simpler error message would be: "supplyData.loc["A","Mn"]: Column "Mn" not found in data frame supplyData." IMHO, programmers do not pay enough attention to providing meaningful error messages.
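
One way to get such a message is to validate the data before building the model; a minimal sketch (my own addition, not from the original post):

# fail early with a readable message if element columns are missing
missing = [j for j in Elements if j not in supplyData.columns]
if missing:
  raise ValueError("supplyData: missing element column(s): {}".format(missing))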

Solver log


The default solver is CBC, called via a DLL. I don't think it is possible to see the solver log with this setup. I prefer to see the solver log, just to make sure there are no surprises. For this I used:

model.solve(lp.COIN_CMD(msg=1))

This will call the CBC executable (via an MPS file) and it will show the CBC log:


Welcome to the CBC MILP Solver
Version: 2.9
Build Date: Jan  6 2019

command line - cbc.exe fbbd73baa2494109b8e990ce26eb79b6-pulp.mps branch printingOptions all solution fbbd73baa2494109b8e990ce26eb79b6-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 22 COLUMNS
At line 64 RHS
At line 82 BOUNDS
At line 83 ENDATA
Problem MODEL has 17 rows, 10 columns and 34 elements
Coin0008I MODEL read with 0 errors
Presolve 4 (-13) rows, 7 (-3) columns and 18 (-16) elements
0  Obj 479.87991 Primal inf 10199.217 (4)
3  Obj 5887.5743
Optimal - objective value 5887.5743
After Postsolve, objective 5887.5743, infeasibilities - dual 1275.5592 (2), primal 0 (0)
Presolved model was optimal, full model needs cleaning up
Optimal - objective value 5887.5743
Optimal objective 5887.574275 - 3 iterations time 0.012, Presolve 0.00
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.03   (Wallclock seconds):       0.03


The solver log shows that the presolver removes 13 of the 17 rows. This high reduction rate is related to the singleton constraints: the model contains 3+3+7=13 bound constraints, and the presolver gets rid of these by turning them into proper bounds.

Note that msg=1 may not always work when running in a Jupyter notebook.

Debugging 


For debugging PuLP models, I recommend:

  1. print(model). Printing the model shows how PuLP interpreted the constraints.
  2. Writing an LP file: model.writeLP("steel.lp").

The output of print(model) is:


Steel:
MINIMIZE
1.2*Use_A + 1.5*Use_B + 0.9*Use_C + 1.3*Use_D + 1.45*Use_E + 1.2*Use_F + 1.0*Use_G + 0.0
SUBJECT TO
_C1: Use_A <= 4000

_C2: Use_B <= 3000

_C3: Use_C <= 6000

_C4: Use_D <= 5000

_C5: Use_E <= 2000

_C6: Use_F <= 3000

_C7: Use_G <= 2500

_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0

_C9: Content_C >= 2

_C10: Content_C <= 3

_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F
 - 0.6 Use_G = 0

_C12: Content_Cu >= 0.4

_C13: Content_Cu <= 0.6

_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0

_C15: Content_Mn >= 1.2

_C16: Content_Mn <= 1.65

_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000

VARIABLES
Content_C Continuous
Content_Cu Continuous
Content_Mn Continuous
Use_A Continuous
Use_B Continuous
Use_C Continuous
Use_D Continuous
Use_E Continuous
Use_F Continuous
Use_G Continuous


The LP file looks like:


\* Steel *\
Minimize
OBJ: 1.2 Use_A + 1.5 Use_B + 0.9 Use_C + 1.3 Use_D + 1.45 Use_E + 1.2 Use_F
 + Use_G
Subject To
_C1: Use_A <= 4000
_C10: Content_C <= 3
_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F
 - 0.6 Use_G = 0
_C12: Content_Cu >= 0.4
_C13: Content_Cu <= 0.6
_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0
_C15: Content_Mn >= 1.2
_C16: Content_Mn <= 1.65
_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000
_C2: Use_B <= 3000
_C3: Use_C <= 6000
_C4: Use_D <= 5000
_C5: Use_E <= 2000
_C6: Use_F <= 3000
_C7: Use_G <= 2500
_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0
_C9: Content_C >= 2
End

The information is basically the same, but the ordering of the rows is a bit different.

Comparison to CVXPY


We can try to model and solve the same problem using CVXPY. CVXPY is matrix oriented, so it is very different from PuLP. Here is my attempt:


from io import StringIO
import pandas as pd
import numpy as np
import cvxpy as cp

# for inputting tabular data below
def table(s):
  return pd.read_csv(StringIO(s),sep=r'\s+',index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
   ID  Element      Min   Max
   C   Carbon       2     3
   Cu  Copper       0.4   0.6
   Mn  Manganese    1.2   1.65
    """)

supplyData = table("""
  ID  Alloy             C       Cu     Mn     Stock   Price
  A   "Iron alloy"      2.50    0.00   1.30   4000    1.20
  B   "Iron alloy"      3.00    0.00   0.80   3000    1.50
  C   "Iron alloy"      0.00    0.30   0.00   6000    0.90
  D   "Copper alloy"    0.00   90.00   0.00   5000    1.30
  E   "Copper alloy"    0.00   96.00   4.00   2000    1.45
  F   "Aluminum alloy"  0.00    0.40   1.20   3000    1.20
  G   "Aluminum alloy"  0.00    0.60   0.00   2500    1.00
  """)

print("----- Data-------")
print(requirements)
print(supplyData)

#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...] 
Items = supplyData.index
Elements = requirements.index

# extract arrays (make sure order is identical)
Min = requirements.loc[Elements,"Min"]
Max = requirements.loc[Elements,"Max"]
Cost = supplyData.loc[Items,"Price"]
Avail = supplyData.loc[Items,"Stock"]
Element = supplyData.loc[Items,Elements]

# counts
NumItems = np.shape(Items)[0]
NumElements  = np.shape(Elements)[0]

# reshape into proper Numpy column vectors to make cvxpy happy
Min = np.reshape(Min.to_numpy(),(NumElements,1))
Max = np.reshape(Max.to_numpy(),(NumElements,1))
Cost = np.reshape(Cost.to_numpy(),(NumItems,1))
Avail = np.reshape(Avail.to_numpy(),(NumItems,1))
Element = Element.to_numpy()

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------

use = cp.Variable((NumItems,1),"Use",nonneg=True)
content = cp.Variable((NumElements,1),"Content",nonneg=True)

model = cp.Problem(cp.Minimize(Cost.T @ use),
                   [cp.sum(use) == demand,
                    cp.multiply(demand,content) == Element.T @ use,  
                    content >= Min,
                    content <= Max,
                    use <= Avail                    
                   ])

#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve(solver=cp.ECOS,verbose=True)

print("----- Model Results-------")
print("status:",model.status)
print("objective:",model.value)
results = pd.DataFrame({'variable':'use', 
                        'index': Items, 
                        'lower':0, 
                        'level':use.value.flatten(),
                        'upper':Avail.flatten()
                        })
# append rows for the content variables (pd.concat replaces the deprecated DataFrame.append)
results = pd.concat([results, pd.DataFrame({'variable':'content', 
                        'index': Elements, 
                        'lower':Min.flatten(), 
                        'level':content.value.flatten(),
                        'upper':Max.flatten()
                        })])
print(results)



Notes:

  • I did my best to make sure that the ordering of rows and columns in the data frames is not significant. 
  • We convert the information in the data frames to standard NumPy arrays for the benefit of CVXPY. (A column in a dataframe is a pandas series).
  • If we don't do proper shaping of the arrays, we may see error messages like: ValueError: Cannot broadcast dimensions  (3,) (3, 1)
  • The model is compact, but we needed to put more effort into data extraction. In optimization, it is not at all unusual that data handling takes more effort than the model equations.

The results look like:


----- Data-------
      Element  Min   Max
ID                      
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID                                              
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00

ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS

It     pcost       dcost      gap   pres   dres    k/t    mu     step   sigma     IR    |   BT
 0  +6.293e+03  -4.362e+04  +9e+04  1e-01  8e-02  1e+00  4e+03    ---    ---    1  1  - |  -  - 
 1  +5.462e+03  -6.110e+04  +7e+04  2e-01  5e-02  2e+03  3e+03  0.5361  8e-01   0  0  0 |  0  0
 2  +5.497e+03  +1.418e+03  +7e+03  1e-02  4e-03  5e+02  3e+02  0.9313  3e-02   0  0  0 |  0  0
 3  +4.981e+03  +3.654e+03  +2e+03  4e-03  1e-03  2e+02  1e+02  0.6947  5e-02   0  0  0 |  0  0
 4  +5.687e+03  +3.974e+03  +2e+03  7e-03  9e-04  3e+02  9e+01  0.4022  7e-01   0  0  0 |  0  0
 5  +5.653e+03  +5.326e+03  +5e+02  1e-03  2e-04  2e+01  2e+01  0.9127  1e-01   0  0  0 |  0  0
 6  +5.692e+03  +5.535e+03  +2e+02  5e-04  8e-05  1e+01  1e+01  0.5874  1e-01   0  0  0 |  0  0
 7  +5.791e+03  +5.642e+03  +2e+02  6e-04  5e-05  2e+01  7e+00  0.7361  5e-01   0  0  0 |  0  0
 8  +5.843e+03  +5.798e+03  +6e+01  2e-04  2e-05  5e+00  2e+00  0.9890  4e-01   0  0  0 |  0  0
 9  +5.886e+03  +5.883e+03  +4e+00  1e-05  1e-06  3e-01  2e-01  0.9454  1e-02   0  0  0 |  0  0
10  +5.888e+03  +5.888e+03  +5e-02  2e-07  3e-08  4e-03  2e-03  0.9890  2e-03   0  0  0 |  0  0
11  +5.888e+03  +5.888e+03  +6e-04  2e-09  5e-10  5e-05  2e-05  0.9890  1e-04   1  0  0 |  0  0
12  +5.888e+03  +5.888e+03  +6e-06  2e-11  6e-12  5e-07  3e-07  0.9890  1e-04   1  0  0 |  0  0

OPTIMAL (within feastol=1.9e-11, reltol=1.1e-09, abstol=6.3e-06).
Runtime: 0.000566 seconds.

----- Model Results-------
status: optimal
objective: 5887.574272281105
  variable index  lower         level    upper
0      use     A    0.0  4.000000e+03  4000.00
1      use     B    0.0  1.283254e-06  3000.00
2      use     C    0.0  3.977630e+02  6000.00
3      use     D    0.0  5.135476e-07  5000.00
4      use     E    0.0  2.761272e+01  2000.00
5      use     F    0.0  5.746243e+02  3000.00
6      use     G    0.0  5.163966e-06  2500.00
0  content     C    2.0  2.000000e+00     3.00
1  content    Cu    0.4  5.999999e-01     0.60
2  content    Mn    1.2  1.200000e+00     1.65


The results are not rounded. That often means that the solution of an interior point algorithm looks a bit ugly. In essence this is the same solution as we found with PuLP/CBC.

Comparison to GAMS


The GAMS model is closer to PuLP:


*---------------------------------------------------
* data
*---------------------------------------------------

set
  i 'items from inventory' /A*G/
  j 'elements' /C,Cu,Mn/
;

table requirements(j,*)
         Min   Max
   C     2     3
   Cu    0.4   0.6
   Mn    1.2   1.65
;

table supplyData(i,*)
       C      Cu     Mn     Stock   Price
  A    2.5            1.3    4000    1.20
  B    3              0.8    3000    1.50
  C            0.3           6000    0.90
  D           90             5000    1.30
  E           96      4      2000    1.45
  F            0.40   1.20   3000    1.20
  G            0.60          2500    1.00
;

scalar demand /5000/;

*---------------------------------------------------
* LP Model
*---------------------------------------------------

positive variables
    use(i)       'usage of raw material'
    content(j)   'characteristics of final product'
;


* bounds
use.up(i) = supplyData(i,'Stock');
content.lo(j) = requirements(j,'Min');
content.up(j) = requirements(j,'Max');

variable totalCost 'objective variable';

equations
    obj  'objective'
    calcContent(j) 'calculate contents of final product'
    meetDemand 'total demand must be met'
;

obj.. totalCost =e= sum(i, use(i)*supplyData(i,'Price'));

calcContent(j).. demand*content(j) =e= sum(i, use(i)*supplyData(i,j));

meetDemand.. sum(i, use(i)) =e= demand;

model blending /all/;

*---------------------------------------------------
* Solve and reporting
*---------------------------------------------------
solve blending minimizing totalCost using lp;

parameter results(*,*,*);
results('use',i,'lower') = 0;
results('use',i,'level') = use.l(i);
results('use',i,'upper') = supplyData(i,'Stock');
results('content',j,'lower') = requirements(j,'Min');
results('content',j,'level') = content.l(j);
results('content',j,'upper') = requirements(j,'Max');
display results;

Notes:

  • In GAMS we start with the sets, followed by the data. With external data we can extract sets from data.
  • The rows and columns in the tables are not order dependent: we can change the order without changing the meaning of the model. 
  • We can specify bounds directly: no need for singleton constraints.
  • The statement results('use',i,'lower') = 0; is not really needed as the default value is zero. (To be more precise: all data is stored sparse, where "does not exist" is the same as zero). However, this statement forces the set element 'lower' to be before 'level' and 'upper'. Another way to enforce this, is to introduce a dummy set: set dummy /lower,level,upper/. We don't have to use the set. Just the declaration would instill an element ordering.
  • I used * in the data tables. This conveniently disables domain checking, but it is also dangerous. In this case, if we change "Mn" in the supplyData table to "Mng", we will not see an error. It would be safer to use full domain checking, which requires adding a few sets.
  • The safe version of the data part in GAMS would be:
    set
      i    'items from inventory' /A*G/
      d    'demand limits' /Min,Max/
      s    'supply data' /C,Cu,Mn,Stock,Price/
      j(s) 'elements' /C,Cu,Mn/
    ;
    
    table requirements(j,d)
             Min   Max
       C     2     3
       Cu    0.4   0.6
       Mn    1.2   1.65
    ;
    
    table supplyData(i,s)
           C      Cu     Mng    Stock   Price
      A    2.5            1.3    4000    1.20
      B    3              0.8    3000    1.50
      C            0.3           6000    0.90
      D           90             5000    1.30
      E           96      4      2000    1.45
      F            0.40   1.20   3000    1.20
      G            0.60          2500    1.00
    ;
    
    Here the typo "Mng" will immediately give a good error message:
      20  table supplyData(i,s)
      21         C      Cu     Mng    Stock   Price
    ****                         $170
    **** 170  Domain violation for element
    For production models it is best to use domain checking throughout.
  • IMHO, this GAMS model looks a bit more streamlined than the Python models.


The output looks like:


----     70 PARAMETER results  

                 lower       level       upper

use    .A                 4000.000    4000.000
use    .B                             3000.000
use    .C                  397.763    6000.000
use    .D                             5000.000
use    .E                   27.613    2000.000
use    .F                  574.624    3000.000
use    .G                             2500.000
content.C        2.000       2.000       3.000
content.Cu       0.400       0.600       0.600
content.Mn       1.200       1.200       1.650


References


  1. Translating a LP from Excel to Python, https://stackoverflow.com/questions/59579342/translating-a-lp-from-excel-to-python-pulp