In [1] a small blending problem is proposed.
There are different raw steel materials that have to be mixed (blended) into a final material that has certain specifications. In this case the specifications are limits on the elements Carbon (C), Copper (Cu) and Manganese (Mn). We assume things blend linearly.
Blending is a traditional linear programming application, and models like this are found in many textbooks.
The problem is small so let's try PuLP here.
I'll write a bit about indexing using strings (instead of integers), and compare PuLP with CVXPY and GAMS. As the model is small, I'll also add some data manipulation (using data frames) and some simple reporting (also using data frames). I use string indexing in PuLP (with GAMS this is standard). CVXPY is, of course, very different: it is matrix based, so there things are position dependent. The idea is to extract data from the data frames in such a way that positions are predictable.
Problem data
The data for the problem is as follows:
Demand: 5000 kg

Specification of the final material:

Element | %Minimum | %Maximum
---|---|---
Carbon | 2 | 3
Copper | 0.4 | 0.6
Manganese | 1.2 | 1.65

Raw material inventory:

Alloy | C% | Cu% | Mn% | Stock (kg) | Price (€/kg)
---|---|---|---|---|---
Iron alloy | 2.50 | 0.00 | 1.30 | 4000 | 1.20
Iron alloy | 3.00 | 0.00 | 0.80 | 3000 | 1.50
Iron alloy | 0.00 | 0.30 | 0.00 | 6000 | 0.90
Copper alloy | 0.00 | 90.00 | 0.00 | 5000 | 1.30
Copper alloy | 0.00 | 96.00 | 4.00 | 2000 | 1.45
Aluminum alloy | 0.00 | 0.40 | 1.20 | 3000 | 1.20
Aluminum alloy | 0.00 | 0.60 | 0.00 | 2500 | 1.00
Mathematical Model
The basic model is:
Blending Model |
---|
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i\cdot \color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Min}}_j \le \frac{\displaystyle\sum_i \color{darkblue}{\mathit{Element}}_{i,j}\cdot \color{darkred}{\mathit{Use}}_i}{\displaystyle \sum_i \color{darkred}{\mathit{Use}}_i} \le \color{darkblue}{\mathit{Max}}_j \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & 0 \le \color{darkred}{\mathit{Use}}_i \le \color{darkblue}{\mathit{Available}}_i \end{align} \] |
Here we use: \[\begin{cases} i & \text{Types of raw material in stock}\\ j & \text{Element with limits}\\ {\mathit{Cost}}_i & \text{Unit cost of raw material} \\ {\mathit{Min}}_j, {\mathit{Max}}_j & \text{Limits on element content in final product} \\ {\mathit{Element}}_{i,j} & \text{Content of elements in raw material}\\ {\mathit{Demand}} & \text{Demand for final product}\\ {\mathit{Available}}_i & \text{Availability of raw material} \\ {\mathit{Use}}_i & \text{Decision variable: how much raw material to use} \end{cases}\] The blending constraint is nonlinear: we divide by the total weight of the final product to calculate the percentages. We can linearize this fraction in two ways:
- multiply all sides by \(\sum_i \mathit{Use}_i\). This leads to \[\mathit{Min}_j \cdot\sum_i \mathit{Use}_i \le \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j \sum_i \mathit{Use}_i\] This first reformulation is especially useful when the total final product is not constant. Note that this formulation is sometimes difficult to recognize when rewritten as something like: \[\begin{cases} \displaystyle \sum_i (\mathit{Element}_{i,j}-\mathit{Min}_j )\mathit{Use}_i \ge 0 \>\>\forall j \\ \displaystyle \sum_i (\mathit{Max}_j - \mathit{Element}_{i,j})\mathit{Use}_i \ge 0 \>\>\forall j \end{cases}\]
- In our case we know that \(\sum_i \mathit{Use}_i\) is constant: it is always equal to \(\mathit{Demand}\). I.e. \[ \mathit{Min}_j \le \frac{1}{\mathit{Demand}} \sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \le \mathit{Max}_j\]
We often need to split such sandwich equations into two simple inequalities, which leads to duplicate expressions: \[\begin{cases} \displaystyle \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j}\cdot\mathit{Use}_i \ge \mathit{Min}_j \>\>\forall j \\ \displaystyle \frac{1}{\mathit{Demand}}\sum_i \mathit{Element}_{i,j} \cdot\mathit{Use}_i \le \mathit{Max}_j \>\>\forall j \end{cases}\] For small problems this is not an issue. For larger problems, I prefer to introduce extra variables that prevent these duplicate expressions. We end up with the following linear programming formulation:
Linear Programming Formulation |
---|
\[\begin{align}\min & \sum_i \color{darkblue}{\mathit{Cost}}_i \cdot\color{darkred}{\mathit{Use}}_i\\ & \color{darkblue}{\mathit{Demand}} \cdot \color{darkred}{\mathit{Content}}_j = \sum_i \color{darkblue}{\mathit{Element}}_{i,j} \cdot\color{darkred}{\mathit{Use}}_i \\ & \sum_i \color{darkred}{\mathit{Use}}_i = \color{darkblue}{\mathit{Demand}} \\ & \color{darkred}{\mathit{Use}}_i \in [0, \color{darkblue}{\mathit{Available}}_i]\\ & \color{darkred}{\mathit{Content}}_j \in [\color{darkblue}{\mathit{Min}}_j,\color{darkblue}{\mathit{Max}}_j] \end{align} \] |
Even for small, almost trivial models, it makes sense to first develop a mathematical model, especially if you are not very experienced in developing linear programming models. Starting with pen and paper is sometimes better than immediately starting to code.
Implementation in Python/PuLP
An implementation using PuLP can look like:
from io import StringIO
import pandas as pd
import pulp as lp

# for inputting tabular data below
def table(s):
    return pd.read_csv(StringIO(s),sep='\s+',index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
ID  Element    Min  Max
C   Carbon     2    3
Cu  Copper     0.4  0.6
Mn  Manganese  1.2  1.65
""")

supplyData = table("""
ID  Alloy             C     Cu     Mn    Stock  Price
A   "Iron alloy"      2.50  0.00   1.30  4000   1.20
B   "Iron alloy"      3.00  0.00   0.80  3000   1.50
C   "Iron alloy"      0.00  0.30   0.00  6000   0.90
D   "Copper alloy"    0.00  90.00  0.00  5000   1.30
E   "Copper alloy"    0.00  96.00  4.00  2000   1.45
F   "Aluminum alloy"  0.00  0.40   1.20  3000   1.20
G   "Aluminum alloy"  0.00  0.60   0.00  2500   1.00
""")

print("----- Data-------")
print(requirements)
print(supplyData)

#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...]
Items = supplyData.index
Elements = requirements.index

print("----- Indices-------")
print(Items)
print(Elements)

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------

use = lp.LpVariable.dicts("Use",Items,0,None,cat='Continuous')
content = lp.LpVariable.dicts("Content",Elements,0,None,cat='Continuous')

model = lp.LpProblem("Steel", lp.LpMinimize)

# objective : minimize cost
model += lp.lpSum([use[i]*supplyData.loc[i,'Price'] for i in Items])

# upper bounds wrt availability
for i in Items:
    model += use[i] <= supplyData.loc[i,'Stock']

# final content of elements and their bounds
for j in Elements:
    model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
    model += content[j] >= requirements.loc[j,'Min']
    model += content[j] <= requirements.loc[j,'Max']

# meet demand
model += lp.lpSum([use[i] for i in Items]) == demand

# for debugging
#print(model)

#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve()
print("----- Model Results-------")
print("Status:", lp.LpStatus[model.status])
print("Objective:",lp.value(model.objective))

# collect results
L = []
for i in Items:
    L.append(['use',i,0.0,use[i].varValue,supplyData.loc[i,'Stock']])
for j in Elements:
    L.append(['content',j,requirements.loc[j,'Min'],content[j].varValue,requirements.loc[j,'Max']])
results = pd.DataFrame(L,columns=['Variable','Index','Lower','Value','Upper'])
print(results)
Notes:
- We input the basic data as data frames. Data frames are a standard way to handle tabular data; they originated in the R statistical software system.
- Usually read_csv is used for CSV files. Here we use it to read from a string. Blanks are used as separators to make the table more readable for humans.
- For each data frame we added an index column. This index allows us to select a row from the data frame. Note that the index is a string. In general, using strings as an index is safer than using positional numbers: we notice much earlier that something is wrong when we make a mistake such as using \(j\) (element) where \(i\) (raw material) was intended.
- Python pandas allows duplicate indices. We can check for this using the duplicated() function (see the sketch after these notes).
- Because we access the data by name, it would not matter if the rows or columns are in a different position. This is more like a database table, where we assume no particular ordering.
- We also use a data frame for reporting. Data frames are printed in a nicer way than Python arrays, and they can be exported to CSV files or a spreadsheet with one function call.
- The variables are also indexed by names. This is accomplished by lp.LpVariable.dicts(). This is safer than using a standard array of variables.
- AFAIK, PuLP can only handle a single scalar bound in the LpVariable statement (e.g. all lower bounds of a variable block are zero). This means we either have to specify a number of bounds as explicit singleton constraints, or set them afterwards with a var.bounds statement (also illustrated in the sketch below).
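To illustrate the last two notes, here is a small sketch of how the duplicate-index check and the per-variable bounds could look. It is meant as a continuation of the script above (it assumes the data frames and the use/content variables defined there); the assert checks and the bounds() calls are my own additions, not part of the original model:

# guard against duplicate row labels before building the model
assert not supplyData.index.duplicated().any(), "duplicate IDs in supplyData"
assert not requirements.index.duplicated().any(), "duplicate IDs in requirements"

# set per-variable bounds via var.bounds() instead of singleton constraints
for i in Items:
    use[i].bounds(0, supplyData.loc[i,'Stock'])
for j in Elements:
    content[j].bounds(requirements.loc[j,'Min'], requirements.loc[j,'Max'])

With this variant the explicit singleton bound constraints in the model are no longer needed.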
The results look like:
----- Data-------
      Element  Min   Max
ID
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00
----- Indices-------
Index(['A', 'B', 'C', 'D', 'E', 'F', 'G'], dtype='object', name='ID')
Index(['C', 'Cu', 'Mn'], dtype='object', name='ID')
----- Model Results-------
Status: Optimal
Objective: 5887.57427835
  Variable Index  Lower        Value    Upper
0      use     A    0.0  4000.000000  4000.00
1      use     B    0.0     0.000000  3000.00
2      use     C    0.0   397.763020  6000.00
3      use     D    0.0     0.000000  5000.00
4      use     E    0.0    27.612723  2000.00
5      use     F    0.0   574.624260  3000.00
6      use     G    0.0     0.000000  2500.00
7  content     C    2.0     2.000000     3.00
8  content    Cu    0.4     0.600000     0.60
9  content    Mn    1.2     1.200000     1.65
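As a quick sanity check, we can reproduce the reported content values by hand from the Use levels and the element percentages in the data table: \[\mathit{Content}_{C} = \frac{2.5\cdot 4000 + 3.0\cdot 0}{5000} = 2.0, \qquad \mathit{Content}_{Mn} = \frac{1.3\cdot 4000 + 4.0\cdot 27.61 + 1.2\cdot 574.62}{5000} \approx 1.2\] Both carbon and manganese end up exactly at their lower bounds, in agreement with the results table above.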
Safety

What happens if we make a mistake in the data, say we misspell the column Mn in the supplyData table? Running the model then produces a rather intimidating traceback:
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
9 frames
<ipython-input-3-d87ed8f34980> in <module>()
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61     model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62     model += content[j] >= requirements.loc[j,'Min']
     63     model += content[j] <= requirements.loc[j,'Max']

<ipython-input-3-d87ed8f34980> in <listcomp>(.0)
     59 # final content of elements and their bounds
     60 for j in Elements:
---> 61     model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
     62     model += content[j] >= requirements.loc[j,'Min']
     63     model += content[j] <= requirements.loc[j,'Max']

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1416             except (KeyError, IndexError, AttributeError):
   1417                 pass
-> 1418             return self._getitem_tuple(key)
   1419         else:
   1420             # we by definition only have the 0th axis

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
    803     def _getitem_tuple(self, tup):
    804         try:
--> 805             return self._getitem_lowerdim(tup)
    806         except IndexingError:
    807             pass

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_lowerdim(self, tup)
    959                     return section
    960                 # This is an elided recursive call to iloc/loc/etc'
--> 961                 return getattr(section, self.name)[new_key]
    962
    963         raise IndexingError("not applicable")

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in __getitem__(self, key)
   1422
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425
   1426     def _is_scalar_access(self, key: Tuple):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1848         # fall thru to straight lookup
   1849         self._validate_key(key, axis)
-> 1850         return self._get_label(key, axis=axis)
   1851
   1852

/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py in _get_label(self, label, axis)
    154             # but will fail when the index is not present
    155             # see GH5667
--> 156             return self.obj._xs(label, axis=axis)
    157         elif isinstance(label, tuple) and isinstance(label[axis], slice):
    158             raise IndexingError("no slices here, handle elsewhere")

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3735             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3736         else:
-> 3737             loc = self.index.get_loc(key)
   3738
   3739         if isinstance(loc, np.ndarray):

/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'Mn'
PuLP is not giving a very well-formed error message here (I am not sure PuLP can actually do that: Python raises this exception before PuLP sees what is happening). But at least we are alerted, rather heavy-handedly, that something is wrong with Mn. Careful inspection of the stack trace shows the problem is in the constraint
model += demand*content[j] == lp.lpSum([use[i]*supplyData.loc[i,j] for i in Items])
This error actually generates an exception inside an exception handler!
Of course, a much better and simpler error message would be: "supplyData.loc["A","Mn"]: Column "Mn" not found in data frame supplyData." IMHO, programmers do not pay enough attention to providing meaningful error messages.
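One way to get a message along those lines is to route the data-frame lookups through a small validating helper. This is just a sketch; the function getValue below is hypothetical and not part of pandas or PuLP:

# hypothetical helper: look up a single value, but fail with a readable message
def getValue(df, dfname, row, col):
    if col not in df.columns:
        raise KeyError(f'{dfname}.loc["{row}","{col}"]: column "{col}" not found in data frame {dfname}')
    if row not in df.index:
        raise KeyError(f'{dfname}.loc["{row}","{col}"]: row "{row}" not found in data frame {dfname}')
    return df.loc[row, col]

# example use inside the content constraint:
#   lp.lpSum([use[i]*getValue(supplyData,"supplyData",i,j) for i in Items])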
Solver log
The default solver is CBC via a DLL call. I don't think it is possible to see the solver log using this setup. I prefer to see the solver log, just to make sure there are no surprises. For this I used:
model.solve(lp.COIN_CMD(msg=1))
This will call the CBC executable (via an MPS file) and it will show the CBC log:
Welcome to the CBC MILP Solver
Version: 2.9
Build Date: Jan 6 2019

command line - cbc.exe fbbd73baa2494109b8e990ce26eb79b6-pulp.mps branch printingOptions all solution fbbd73baa2494109b8e990ce26eb79b6-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 22 COLUMNS
At line 64 RHS
At line 82 BOUNDS
At line 83 ENDATA
Problem MODEL has 17 rows, 10 columns and 34 elements
Coin0008I MODEL read with 0 errors
Presolve 4 (-13) rows, 7 (-3) columns and 18 (-16) elements
0  Obj 479.87991 Primal inf 10199.217 (4)
3  Obj 5887.5743
Optimal - objective value 5887.5743
After Postsolve, objective 5887.5743, infeasibilities - dual 1275.5592 (2), primal 0 (0)
Presolved model was optimal, full model needs cleaning up
Optimal - objective value 5887.5743
Optimal objective 5887.574275 - 3 iterations time 0.012, Presolve 0.00
Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.03   (Wallclock seconds):       0.03
The solver log shows that the presolver removes 13 of the 17 rows. This high reduction rate is related to the singleton constraints: looking at the model, we generated 3+3+7=13 bound constraints. The presolver gets rid of these and turns them into proper bounds.
Note that msg=1 may not always work when running inside a Jupyter notebook.
Debugging
For debugging PuLP models, I recommend:
- print(model). Printing the model shows how PuLP interpreted the constraints.
- Writing an LP file: model.writeLP("steel.lp").
The output of print(model) is:
Steel:
MINIMIZE
1.2*Use_A + 1.5*Use_B + 0.9*Use_C + 1.3*Use_D + 1.45*Use_E + 1.2*Use_F + 1.0*Use_G + 0.0
SUBJECT TO
_C1: Use_A <= 4000
_C2: Use_B <= 3000
_C3: Use_C <= 6000
_C4: Use_D <= 5000
_C5: Use_E <= 2000
_C6: Use_F <= 3000
_C7: Use_G <= 2500
_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0
_C9: Content_C >= 2
_C10: Content_C <= 3
_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F - 0.6 Use_G = 0
_C12: Content_Cu >= 0.4
_C13: Content_Cu <= 0.6
_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0
_C15: Content_Mn >= 1.2
_C16: Content_Mn <= 1.65
_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000
VARIABLES
Content_C Continuous
Content_Cu Continuous
Content_Mn Continuous
Use_A Continuous
Use_B Continuous
Use_C Continuous
Use_D Continuous
Use_E Continuous
Use_F Continuous
Use_G Continuous
The LP file looks like:
\* Steel *\
Minimize
OBJ: 1.2 Use_A + 1.5 Use_B + 0.9 Use_C + 1.3 Use_D + 1.45 Use_E + 1.2 Use_F + Use_G
Subject To
_C1: Use_A <= 4000
_C10: Content_C <= 3
_C11: 5000 Content_Cu - 0.3 Use_C - 90 Use_D - 96 Use_E - 0.4 Use_F - 0.6 Use_G = 0
_C12: Content_Cu >= 0.4
_C13: Content_Cu <= 0.6
_C14: 5000 Content_Mn - 1.3 Use_A - 0.8 Use_B - 4 Use_E - 1.2 Use_F = 0
_C15: Content_Mn >= 1.2
_C16: Content_Mn <= 1.65
_C17: Use_A + Use_B + Use_C + Use_D + Use_E + Use_F + Use_G = 5000
_C2: Use_B <= 3000
_C3: Use_C <= 6000
_C4: Use_D <= 5000
_C5: Use_E <= 2000
_C6: Use_F <= 3000
_C7: Use_G <= 2500
_C8: 5000 Content_C - 2.5 Use_A - 3 Use_B = 0
_C9: Content_C >= 2
End
The information is basically the same, but the ordering of the rows is a bit different.
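Besides dumping the whole model, we can also inspect pieces of it programmatically, which becomes handy for larger models. A minimal sketch, again as a continuation of the PuLP script above:

# list the generated constraints by their (auto-generated) names
for name, con in model.constraints.items():
    print(name, ":", con)

# list the variables with their bounds
for v in model.variables():
    print(v.name, v.lowBound, v.upBound)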
Comparison to CVXPY
We can try to model and solve the same problem using CVXPY. CVXPY is matrix oriented, so it is very different from PuLP. Here is my attempt:
from io import StringIO
import pandas as pd
import numpy as np
import cvxpy as cp

# for inputting tabular data below
def table(s):
    return pd.read_csv(StringIO(s),sep='\s+',index_col='ID')

#------------------------------------------------------------------
# data
#------------------------------------------------------------------

demand = 5000

requirements = table("""
ID  Element    Min  Max
C   Carbon     2    3
Cu  Copper     0.4  0.6
Mn  Manganese  1.2  1.65
""")

supplyData = table("""
ID  Alloy             C     Cu     Mn    Stock  Price
A   "Iron alloy"      2.50  0.00   1.30  4000   1.20
B   "Iron alloy"      3.00  0.00   0.80  3000   1.50
C   "Iron alloy"      0.00  0.30   0.00  6000   0.90
D   "Copper alloy"    0.00  90.00  0.00  5000   1.30
E   "Copper alloy"    0.00  96.00  4.00  2000   1.45
F   "Aluminum alloy"  0.00  0.40   1.20  3000   1.20
G   "Aluminum alloy"  0.00  0.60   0.00  2500   1.00
""")

print("----- Data-------")
print(requirements)
print(supplyData)

#------------------------------------------------------------------
# derived data
#------------------------------------------------------------------

# our sets are stockItems ["A","B",..] and elements ["C","Cu",...]
Items = supplyData.index
Elements = requirements.index

# extract arrays (make sure order is identical)
Min = requirements.loc[Elements,"Min"]
Max = requirements.loc[Elements,"Max"]
Cost = supplyData.loc[Items,"Price"]
Avail = supplyData.loc[Items,"Stock"]
Element = supplyData.loc[Items,Elements]

# counts
NumItems = np.shape(Items)[0]
NumElements = np.shape(Elements)[0]

# reshape into proper NumPy column vectors to make cvxpy happy
Min = np.reshape(Min.to_numpy(),(NumElements,1))
Max = np.reshape(Max.to_numpy(),(NumElements,1))
Cost = np.reshape(Cost.to_numpy(),(NumItems,1))
Avail = np.reshape(Avail.to_numpy(),(NumItems,1))
Element = Element.to_numpy()

#------------------------------------------------------------------
# LP Model
#------------------------------------------------------------------

use = cp.Variable((NumItems,1),"Use",nonneg=True)
content = cp.Variable((NumElements,1),"Content",nonneg=True)

model = cp.Problem(cp.Minimize(Cost.T @ use),
          [cp.sum(use) == demand,
           cp.multiply(demand,content) == Element.T @ use,
           content >= Min,
           content <= Max,
           use <= Avail
          ])

#------------------------------------------------------------------
# Solve and reporting
#------------------------------------------------------------------

model.solve(solver=cp.ECOS,verbose=True)
print("----- Model Results-------")
print("status:",model.status)
print("objective:",model.value)

results = pd.DataFrame({'variable':'use', 'index': Items, 'lower':0,
                        'level':use.value.flatten(), 'upper':Avail.flatten()})
results = results.append(pd.DataFrame({'variable':'content', 'index': Elements, 'lower':Min.flatten(),
                                       'level':content.value.flatten(), 'upper':Max.flatten()}))
print(results)
Notes:
- I did my best to make sure that the ordering of rows and columns in the data frames is not significant.
- We convert the information in the data frames to standard NumPy arrays for the benefit of CVXPY. (A column in a dataframe is a pandas series).
- If we don't shape the arrays properly, we may see error messages like ValueError: Cannot broadcast dimensions (3,) (3, 1) (see the small illustration after these notes).
- The model itself is compact, but we needed to put more effort into data extraction. In optimization it is not at all unusual that data handling takes more effort than the model equations.
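To make the shape issue a bit more concrete, here is a small standalone NumPy illustration, independent of the model. CVXPY refuses to combine the mismatched shapes (hence the error above), while plain NumPy would silently broadcast them into a matrix, which is why I reshape everything into explicit column vectors:

import numpy as np

a = np.zeros(3)        # shape (3,): a flat 1-d array
b = np.zeros((3, 1))   # shape (3, 1): an explicit column vector

print((a + b).shape)   # (3, 3): NumPy broadcasting quietly produces a matrix

a_col = a.reshape(-1, 1)   # same as np.reshape(a, (3, 1))
print((a_col + b).shape)   # (3, 1): elementwise, as intended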
The results look like:
----- Data-------
      Element  Min   Max
ID
C      Carbon  2.0  3.00
Cu     Copper  0.4  0.60
Mn  Manganese  1.2  1.65
             Alloy    C    Cu   Mn  Stock  Price
ID
A       Iron alloy  2.5   0.0  1.3   4000   1.20
B       Iron alloy  3.0   0.0  0.8   3000   1.50
C       Iron alloy  0.0   0.3  0.0   6000   0.90
D     Copper alloy  0.0  90.0  0.0   5000   1.30
E     Copper alloy  0.0  96.0  4.0   2000   1.45
F   Aluminum alloy  0.0   0.4  1.2   3000   1.20
G   Aluminum alloy  0.0   0.6  0.0   2500   1.00

ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS

It     pcost       dcost      gap   pres   dres    k/t    mu     step   sigma     IR    |   BT
 0  +6.293e+03  -4.362e+04  +9e+04  1e-01  8e-02  1e+00  4e+03    ---    ---    1  1  - |  -  -
 1  +5.462e+03  -6.110e+04  +7e+04  2e-01  5e-02  2e+03  3e+03  0.5361  8e-01   0  0  0 |  0  0
 2  +5.497e+03  +1.418e+03  +7e+03  1e-02  4e-03  5e+02  3e+02  0.9313  3e-02   0  0  0 |  0  0
 3  +4.981e+03  +3.654e+03  +2e+03  4e-03  1e-03  2e+02  1e+02  0.6947  5e-02   0  0  0 |  0  0
 4  +5.687e+03  +3.974e+03  +2e+03  7e-03  9e-04  3e+02  9e+01  0.4022  7e-01   0  0  0 |  0  0
 5  +5.653e+03  +5.326e+03  +5e+02  1e-03  2e-04  2e+01  2e+01  0.9127  1e-01   0  0  0 |  0  0
 6  +5.692e+03  +5.535e+03  +2e+02  5e-04  8e-05  1e+01  1e+01  0.5874  1e-01   0  0  0 |  0  0
 7  +5.791e+03  +5.642e+03  +2e+02  6e-04  5e-05  2e+01  7e+00  0.7361  5e-01   0  0  0 |  0  0
 8  +5.843e+03  +5.798e+03  +6e+01  2e-04  2e-05  5e+00  2e+00  0.9890  4e-01   0  0  0 |  0  0
 9  +5.886e+03  +5.883e+03  +4e+00  1e-05  1e-06  3e-01  2e-01  0.9454  1e-02   0  0  0 |  0  0
10  +5.888e+03  +5.888e+03  +5e-02  2e-07  3e-08  4e-03  2e-03  0.9890  2e-03   0  0  0 |  0  0
11  +5.888e+03  +5.888e+03  +6e-04  2e-09  5e-10  5e-05  2e-05  0.9890  1e-04   1  0  0 |  0  0
12  +5.888e+03  +5.888e+03  +6e-06  2e-11  6e-12  5e-07  3e-07  0.9890  1e-04   1  0  0 |  0  0

OPTIMAL (within feastol=1.9e-11, reltol=1.1e-09, abstol=6.3e-06).
Runtime: 0.000566 seconds.

----- Model Results-------
status: optimal
objective: 5887.574272281105
  variable index  lower         level    upper
0      use     A    0.0  4.000000e+03  4000.00
1      use     B    0.0  1.283254e-06  3000.00
2      use     C    0.0  3.977630e+02  6000.00
3      use     D    0.0  5.135476e-07  5000.00
4      use     E    0.0  2.761272e+01  2000.00
5      use     F    0.0  5.746243e+02  3000.00
6      use     G    0.0  5.163966e-06  2500.00
0  content     C    2.0  2.000000e+00     3.00
1  content    Cu    0.4  5.999999e-01     0.60
2  content    Mn    1.2  1.200000e+00     1.65
The results are not rounded, so the solution of an interior point algorithm looks a bit ugly. In essence, this is the same solution as we found with PuLP/CBC.
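If the tiny values in the report are distracting, we can clean them up after solving. A small post-processing sketch (assuming the results data frame built in the CVXPY script above):

# round the levels and snap interior-point noise near zero to exactly 0
results['level'] = results['level'].round(6)
results.loc[results['level'].abs() < 1e-6, 'level'] = 0.0
print(results)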
Comparison to GAMS
The GAMS model is closer to PuLP:
*---------------------------------------------------
* data
*---------------------------------------------------

set
   i 'items from inventory' /A*G/
   j 'elements' /C,Cu,Mn/
;

table requirements(j,*)
          Min    Max
   C      2      3
   Cu     0.4    0.6
   Mn     1.2    1.65
;

table supplyData(i,*)
          C      Cu     Mn     Stock   Price
   A      2.5           1.3    4000    1.20
   B      3             0.8    3000    1.50
   C             0.3           6000    0.90
   D             90             5000    1.30
   E             96     4      2000    1.45
   F             0.40   1.20   3000    1.20
   G             0.60          2500    1.00
;

scalar demand /5000/;

*---------------------------------------------------
* LP Model
*---------------------------------------------------

positive variables
   use(i)     'usage of raw material'
   content(j) 'characteristics of final product'
;

* bounds
use.up(i) = supplyData(i,'Stock');
content.lo(j) = requirements(j,'Min');
content.up(j) = requirements(j,'Max');

variable totalCost 'objective variable';

equations
   obj            'objective'
   calcContent(j) 'calculate contents of final product'
   meetDemand     'total demand must be met'
;

obj..            totalCost =e= sum(i, use(i)*supplyData(i,'Price'));
calcContent(j).. demand*content(j) =e= sum(i, use(i)*supplyData(i,j));
meetDemand..     sum(i, use(i)) =e= demand;

model blending /all/;
*---------------------------------------------------
* Solve and reporting
*---------------------------------------------------

solve blending minimizing totalCost using lp;

parameter results(*,*,*);
results('use',i,'lower') = 0;
results('use',i,'level') = use.l(i);
results('use',i,'upper') = supplyData(i,'Stock');
results('content',j,'lower') = requirements(j,'Min');
results('content',j,'level') = content.l(j);
results('content',j,'upper') = requirements(j,'Max');
display results;
Notes:
- In GAMS we start with the sets, followed by the data. When reading external data, we can also extract the sets from that data.
- The rows and columns in the tables are not order dependent: we can change the order without changing the meaning of the model.
- We can specify bounds directly: no need for singleton constraints.
- The statement results('use',i,'lower') = 0; is not really needed as the default value is zero (to be more precise: all data is stored sparsely, where "does not exist" is the same as zero). However, this statement forces the set element 'lower' to be ordered before 'level' and 'upper'. Another way to enforce this is to introduce a dummy set: set dummy /lower,level,upper/;. We don't have to use the set; just the declaration establishes the element ordering.
- I used * in the data tables. This conveniently disables domain checking, but it is also dangerous: in this case, if we change "Mn" to "Mng" in the supplyData table, we will not see an error. It would be safer to use full domain checking, which requires adding a few sets.
- The safe version of the data part in GAMS would be:
set
   i 'items from inventory' /A*G/
   d 'demand limits' /Min,Max/
   s 'supply data' /C,Cu,Mn,Stock,Price/
   j(s) 'elements' /C,Cu,Mn/
;

table requirements(j,d)
          Min    Max
   C      2      3
   Cu     0.4    0.6
   Mn     1.2    1.65
;

table supplyData(i,s)
          C      Cu     Mng    Stock   Price
   A      2.5           1.3    4000    1.20
   B      3             0.8    3000    1.50
   C             0.3           6000    0.90
   D             90             5000    1.30
   E             96     4      2000    1.45
   F             0.40   1.20   3000    1.20
   G             0.60          2500    1.00
;
Note that the supplyData table above still contains the misspelled column Mng. With domain checking in place, the GAMS compiler flags this immediately:

  20  table supplyData(i,s)
  21          C      Cu     Mng    Stock   Price
****                        $170
**** 170  Domain violation for element
For production models it is best to use domain checking throughout.
- IMHO, this GAMS model looks a bit more streamlined than the Python models.
The output looks like:
----     70 PARAMETER results

                 lower       level       upper

use    .A                 4000.000    4000.000
use    .B                             3000.000
use    .C                  397.763    6000.000
use    .D                             5000.000
use    .E                   27.613    2000.000
use    .F                  574.624    3000.000
use    .G                             2500.000
content.C        2.000       2.000       3.000
content.Cu       0.400       0.600       0.600
content.Mn       1.200       1.200       1.650
References
- [1] Translating a LP from Excel to Python, https://stackoverflow.com/questions/59579342/translating-a-lp-from-excel-to-python-pulp