Tuesday, August 31, 2021

matrix operations via GAMS Embedded Python: multi-regional Input-Output tables

Wassily Leontief in front of an IO table, credit:NYU


Inverting a dense matrix

GAMS allows running pieces of Python code as part of a GAMS model. Unfortunately, the interface is rather low-level and inefficient.  Usually "low-level" is associated with high-performance, but in the Python world, this is not the case. Here is an example.

Suppose we want to use the Python interface to invert a matrix \(A\).

The main GAMS file can look like:

$ontext

  
Invert a square, dense matrix.


$offtext


set i /i1*i1000/;
alias(i,j,k);

parameters
  a(i,j)
'square matrix A'
  b(j,i)
'B=inv(A)'
;
a(i,j) = 1/(
ord(i)+ord(j)-1);
a(i,i) = 1;
* This is a Hilbert matrix with diagonal replaced by ones.

$batinclude invert.inc A B i

display$(card(i)<50) B;

* check solution for smaller instances
parameters
   ident(i,j)
'identity matrix'
   maxerr    
'maximum error'
;

if (card(i)<=200,
   ident(i,i)=1;
   maxerr =
smax((i,j),abs(sum(k,A(i,k)*B(j,k))-ident(i,j)));
  
display maxerr;
);



Note: the check may look a bit strange. However, \(B\) is symmetric, so I swapped the indices. It can save a bit on the processing time.
 
The inversion code itself is packaged in an include file:


* invert.inc
* Arguments:
*    A: input matrix
*    B: output matrix (inverse or all zero)
*    i: sets defining A and B
* if singular return a parameter with all zeros


$offlisting

embeddedCode Python:

gams.printLog("\n### invert.inc")
import numpy as np
import numpy.linalg as la
import time

t0 = time.time()

sa =
"%1"  # name of A matrix
sb =
"%2"  # name of B=inv(A) matrix
si =
"%3"  # name of driving set

#
# load set i
#
I = list(gams.get(si,keyType=KeyType.INT))
n = len(I)
assert n>0, f">>>> set {si} has no elements"
assert isinstance(I[0],int),  f">>>> set {si} is not a 1 dim set"
imap = dict(zip(I,range(n))) 
# maps gams indices -> Python indices


#
# allocate and populate matrix A
#
A = np.zeros((n,n))
try:
  
for (i,j,aij) in gams.get(sa,keyType=KeyType.INT,keyFormat=KeyFormat.FLAT):
       A[imap[i],imap[j]] = aij
except:
  
assert False,f">>>> Error in set elements. Check the set {si}."

t1 = time.time()-t0
gams.printLog(f
".. Copied data from {si},{sa} ({t1:.1f}s)")


#
# invert
#
try:
  B = la.inv(A)
except la.LinAlgError:
  gams.printLog(
".. Singular matrix. Returning zeros.")
  B = np.zeros((n,n))

t2 = time.time()-t0-t1
gams.printLog(f
".. Inverted {sa} ({t2:.1f}s)")

#
# return B
#
GamsB = [((I[k[
0]],I[k[1]]),v) for k,v in np.ndenumerate(B) if abs(v)>1e-9]
if len(GamsB)==0:
   GamsB = [((I[
0],I[0]),0.0)]
gams.set(sb,GamsB)

t3 = time.time()-t0-t1-t2
gams.printLog(f
".. Created GAMS sparse matrix {sb} ({t3:.1f}s)")

t = t1+t2+t3
gams.printLog(f
".. Total elapsed {t:.1f}s")

endEmbeddedCode
%2



We can pass command line arguments to this include file so it functions like a batch file (hence the name batinclude). Not really a function, but similar functionality. The batinclude file contains embedded Python code that does the matrix inversion and also spends time on converting GAMS data to a Python numpy matrix and back. 

The performance of this piece of code is rather dismal. We see:
 

### invert.inc
--- .. Copied data from i,A (19.9s)
--- .. Inverted A (0.7s)
--- .. Created GAMS sparse matrix B (25.6s)
--- .. Total elapsed 46.2s


This means we have an overhead of \[\frac{46.2-0.7}{0.7}\times 100\% = 6,500\%\] We pay dearly for the privilege to use np.inv(A). The reason is that the interface is not using the right abstraction level. We get data that is in a form that needs much processing.

This was an artificial example, let's look at some real examples of applications related to projects I am working on.

The Leontief inverse


In the field of input-output modeling, economists are interested in the Leontief inverse (sometimes called total requirements matrix), or: \[(I-A)^{-1}\] I have a national IO table lying around for Japan, 1995 [2]. This table has the following 93 sectors or industries:


   001  'Crop cultivation'
   002  'Livestock and sericulture'
   003  'Agricultural services'
   004  'Forestry'
   005  'Fisheries'
   006  'Metallic ores'
   007  'Non-metallic ores'
   008  'Coal'
   009  'Crude petroleum and natural gas'
   010  'Foods'
   011  'Drinks'
   012  'Feeds and organic fertilizer, n.e.c.'
   013  'Tabacco'
   014  'Textile products'
   015  'Wearing apparel and other textile products'
   016  'Timber and wooden products'
   017  'Furniture and fixtures'
   018  'Pulp, paper, paperboard and processed paper'
   019  'Paper products'
   020  'Publishing and printing'
   021  'Chemical fertilizer'
   022  'Inorganic basic chemical products'
   023  'Petrochemical basic products and intermediate chemical products'
   024  'Synthetic resins'
   025  'Synthetic fibers'
   026  'Medicaments'
   027  'Final chemical products, n.e.c.'
   028  'Petroleum refinery products'
   029  'Coal products'
   030  'Plastic products'
   031  'Rubber products'
   032  'Leather, fur skins and miscellaneous leather products'
   033  'Glass and glass products'
   034  'Cement and cement products'
   035  'Pottery, china and earthenware'
   036  'Other ceramic, stone and clay products'
   037  'Pig iron and crude steel'
   038  'Steel products'
   039  'Steel castings and forgings and other steel products'
   040  'Non-ferrous metals'
   041  'Non-ferrous metal products'
   042  'Metal products for construction and architecture'
   043  'Other metal products'
   044  'General industrial machinery'
   045  'Special industrial machinery'
   046  'Other general machines'
   047  'Machinery for office and service industry'
   048  'Household electric appliance'
   049  'Electronic equipment and communication equipment'
   050  'Heavy electrical equipment'
   051  'Other electrical machinery'
   052  'Motor vehicles'
   053  'Ships and repair of ships'
   054  'Other transportation equipment and repair of transportation equipment'
   055  'Precision instruments'
   056  'Miscellaneous manufacturing products'
   057  'Construction'
   058  'Repair of constructions'
   059  'Civil engineering'
   060  'Electric power for enterprise use'
   061  'Gas and heat supply'
   062  'Water supply'
   063  'Waste disposal services'
   064  'Commerce'
   065  'Finance and insurance'
   066  'Real estate agencies and rental services'
   067  'House rent'
   068  'Railway transport'
   069  'Road transport(except transport by private cars)'
   070  'Transport by private cars'
   071  'Water transport'
   072  'Air transport'
   073  'Freight forwarding'
   074  'Storage facility services'
   075  'Services relating to transport'
   076  'Communication'
   077  'Broadcasting'
   078  'Public administration'
   079  'Education'
   080  'Research'
   081  'Medical service and health'
   082  'Social security'
   083  'Other public services'
   084  'Advertising, survey and information services'
   085  'Goods rental and leasing services'
   086  'Repair of motor vehicles and machine'
   087  'Other business services'
   088  'Amusement and recreational services'
   089  'Eating and drinking places'
   090  'Hotel and other lodging places'
   091  'Other personal services'
   092  'Office supplies'
   093  'Activities not elsewhere classified'

That is a bit smaller than our previous random matrix.

An IO table has roughly the following structure:


The main matrix elements of this transaction matrix can be defined as \[x_{i,j} = \text{output of industry $i$  sold to industry $j$}\] I.e. the rows are the sellers and the columns are the buyers [3].

The data needs a bit of preprocessing to calculate \(A\) and then \(I-A\) before we can invoke our matrix inversion code. The matrix \(A\) is called the matrix of input coefficients or technological coefficients, and can be formed by: \[a_{i,j} := \frac{x_{i,j}}{X_j}\]

The code to calculate the Leontief inverse \((I-A)^{-1}\) can look like:


$ontext

   
Calculate the Leontief inverse.

$offtext

set i 'sectors/industries' /001*093/;
alias(i,j);

singleton set totalProduction /115/;

table IOTable(*,*) 'complete IO table of 1995 Japan [millions yens]'
$include io93pro.inc
;

display IOTable;

parameter
   A(i,j)    
'input coefficients'
   Ident(i,j)
'Identity matrix'
   IA(i,j)   
'I-A'
   L(i,j)    
'Leontief inverse'
;
A(i,j) = IOTable(i,j)/IOTable(totalProduction,j);

Ident(i,i) = 1;
IA(i,j) = Ident(i,j)-A(i,j);

$batinclude invert.inc IA L i

display L;



This solves very fast as the matrix we need to invert is small: \(93 \times 93\). In the next section, let's have a look at regionalized IO tables. That is more interesting.

Leontief Inverse for Regional IO Tables


Regional Input-Output tables provide us with some challenges:
  • They tend to be big: the number of rows and columns of \(A\) is the number of regions times the number of sectors.
  • The indices for the rows and columns are 2-dimensional: regions and sectors. We either need to map this to a one-dimensional set or we need the matrix-inversion code to accept multi-dimensional sets for the rows and columns. 
Here we try to develop some code that can take a regional \(A\) matrix with multi-dimensional row and column indices (without the need to map this to a 1d set) and produces a Leontief inverse.

As an example, I use WIOT data [4]. This data set has 56 industries and 44 regions.

REGIONS
=======

     AUS      'Australia'
     AUT      'Austria'
     BEL      'Belgium'
     BGR      'Bulgaria'
     BRA      'Brazil'
     CAN      'Canada'
     CHE      'Switserland'
     CHN      'China'
     CYP      'Cyprus'
     CZE      'Czech Republic'
     DEU      'Germany'
     DNK      'Denmark'
     ESP      'Spain'
     EST      'Estonia'
     FIN      'Finland'
     FRA      'France'
     GBR      'UK'
     GRC      'Greece'
     HRV      'Croatia'
     HUN      'Hungary'
     IDN      'Indonesia'
     IND      'India'
     IRL      'Ireland'
     ITA      'Italy'
     JPN      'Japan'
     KOR      'South Korea'
     LTU      'Lithuania'
     LUX      'Luxembourg'
     LVA      'Latvia'
     MEX      'Mexico'
     MLT      'Malta'
     NLD      'The Netherlands'
     NOR      'Norway'
     POL      'Poland'
     PRT      'Portugal'
     ROU      'Romania'
     RUS      'Russia'
     SVK      'Slovak Republic'
     SVN      'Slovenia'
     SWE      'Sweden'
     TUR      'Turkey'
     TWN      'Taiwan'
     USA      'USA'
     ROW      'Rest of the world'


SECTORS
=======

     A01        'Crop and animal production, hunting and related service activities'
     A02        'Forestry and logging'
     A03        'Fishing and aquaculture'
     B          'Mining and quarrying'
     C10-C12    'Manufacture of food products, beverages and tobacco products'
     C13-C15    'Manufacture of textiles, wearing apparel and leather products'
     C16        'Manufacture of wood and of products of wood and cork, except furniture; manufacture of articles of straw and plaiting materials'
     C17        'Manufacture of paper and paper products'
     C18        'Printing and reproduction of recorded media'
     C19        'Manufacture of coke and refined petroleum products'
     C20        'Manufacture of chemicals and chemical products'
     C21        'Manufacture of basic pharmaceutical products and pharmaceutical preparations'
     C22        'Manufacture of rubber and plastic products'
     C23        'Manufacture of other non-metallic mineral products'
     C24        'Manufacture of basic metals'
     C25        'Manufacture of fabricated metal products, except machinery and equipment'
     C26        'Manufacture of computer, electronic and optical products'
     C27        'Manufacture of electrical equipment'
     C28        'Manufacture of machinery and equipment n.e.c.'
     C29        'Manufacture of motor vehicles, trailers and semi-trailers'
     C30        'Manufacture of other transport equipment'
     C31_C32    'Manufacture of furniture; other manufacturing'
     C33        'Repair and installation of machinery and equipment'
     D35        'Electricity, gas, steam and air conditioning supply'
     E36        'Water collection, treatment and supply'
     E37-E39    'Sewerage; waste collection, treatment and disposal activities; materials recovery; remediation activities and other waste management services'
     F          'Construction'
     G45        'Wholesale and retail trade and repair of motor vehicles and motorcycles'
     G46        'Wholesale trade, except of motor vehicles and motorcycles'
     G47        'Retail trade, except of motor vehicles and motorcycles'
     H49        'Land transport and transport via pipelines'
     H50        'Water transport'
     H51        'Air transport'
     H52        'Warehousing and support activities for transportation'
     H53        'Postal and courier activities'
     I          'Accommodation and food service activities'
     J58        'Publishing activities'
     J59_J60    'Motion picture, video and television programme production, sound recording and music publishing activities; programming and broadcasting activities'
     J61        'Telecommunications'
     J62_J63    'Computer programming, consultancy and related activities; information service activities'
     K64        'Financial service activities, except insurance and pension funding'
     K65        'Insurance, reinsurance and pension funding, except compulsory social security'
     K66        'Activities auxiliary to financial services and insurance activities'
     L68        'Real estate activities'
     M69_M70    'Legal and accounting activities; activities of head offices; management consultancy activities'
     M71        'Architectural and engineering activities; technical testing and analysis'
     M72        'Scientific research and development'
     M73        'Advertising and market research'
     M74_M75    'Other professional, scientific and technical activities; veterinary activities'
     N          'Administrative and support service activities'
     O84        'Public administration and defence; compulsory social security'
     P85        'Education'
     Q          'Human health and social work activities'
     R_S        'Other service activities'
     T          'Activities of households as employers; undifferentiated goods- and services-producing activities of households for own use'
     U          'Activities of extraterritorial organizations and bodies'


The resulting IO matrix is \(2464\times 2464\). Of course, the indexing will be a bit more complicated. The \(A\) matrix will look like \(a_{i,r,i',r'}\), i.e. a four-dimensional structure. To help with this, I created a batinclude file that given a matrix \(A\) and a one-, two- or more-dimensional index set, calculates the Leontief inverse matrix \((I-A)^{-1}\). 


Regional IO Table with two-dimensional rows and columns



First: here is the code (1) to read in the IO table, (2) to call the Leontief inverse code, and (3) to perform some checks. The checks are useful in themselves (the data is too large to eyeball, and there are many possibilities for things to go wrong along the way), but also to demonstrate how to write matrix operations over what are essentially tensors.


$ontext

   
Leontief inverse of a large regional IO table

  
Data from http://www.wiod.org/database/wiots16

$offtext

*----------------------------------------------------------------
* Convert spreadsheet and import GDX file
*----------------------------------------------------------------

$set data     WIOT2014_Nov16_ROW
$set xls      %data%.xlsb
$set gdx      %data%.gdx

* remove WIOT2014_Nov16_ROW.gdx to force rereading the spreadsheet

$if not exist %gdx% $call 'gdxxrw %xls% par=wiot rng=A3 rdim=2 cdim=2 ignorerows=4,6 ignorecolumns=B,D'


parameter WIOT(*,*,*,*) 'WIOT Intercountry Input-Output Table';
$gdxin %gdx%
$loaddc WIOT


*----------------------------------------------------------------
* Sets
*----------------------------------------------------------------

sets
   i
'industries' /
     
A01,A02,A03,B,C10-C12,C13-C15,C16,C17,C18,C19,C20,C21
     
C22,C23,C24,C25,C26,C27,C28,C29,C30,C31_C32,C33,D35,E36
     
E37-E39,F,G45,G46,G47,H49,H50,H51,H52,H53,I,J58,J59_J60
     
J61,J62_J63,K64,K65,K66,L68,M69_M70,M71,M72,M73,M74_M75
     
N,O84,P85,Q,R_S,T,U
   
/
   r
'regions (43 countries + ROW)' /
     
AUS,AUT,BEL,BGR,BRA,CAN,CHE,CHN,CYP,CZE,DEU,DNK,ESP
     
EST,FIN,FRA,GBR,GRC,HRV,HUN,IDN,IND,IRL,ITA,JPN,KOR
     
LTU,LUX,LVA,MEX,MLT,NLD,NOR,POL,PRT,ROU,RUS,SVK,SVN
     
SWE,TUR,TWN,USA,ROW
   
/
  ir(i,r)
;
ir(i,r) =
yes;
alias(i,ii,i2,i3),(r,rr,r2,r3),(ir,ir2,ir3);


*----------------------------------------------------------------
* do some counting
*----------------------------------------------------------------

parameter counts(*) 'industries,regions,combinations';
counts(
'i:sectors') = card(i);
counts(
'r:regions') = card(r);
counts(
'ir:combinations') = card(ir);
option counts:0;
display counts;

*----------------------------------------------------------------
* extract parameters
*----------------------------------------------------------------

set
  fd
'final demand'/
         
CONS_h  'Final consumption expenditure by households'
         
CONS_np 'Final consumption expenditure by non-profit organisations serving households (NPISH)'
         
CONS_g  'Final consumption expenditure by government'
         
GFCF    'Gross fixed capital formation'
         
INVEN   'Changes in inventories and valuables'
      
/
;

parameter
   finalDemand(i,r)
'final demand f'
   grossOutput(i,r)
'total'
   A(i,r,i,r)      
'technological coefficients'
   Ident(i,r,i,r)  
'Indentity matrix I'
;
finalDemand(ir) =
sum((fd,r),WIOT(ir,fd,r));
grossOutput(ir) = WIOT(ir,
'GO','TOT');
A(ir,ir2)$grossOutput(ir2) = WIOT(ir,ir2)/grossOutput(ir2);
Ident(i,r,i,r) = 1;

*----------------------------------------------------------------
* sanity check
*----------------------------------------------------------------

* x = Z*1 + f

parameter Xerr(i,r) 'error in identity x = Z*1 + f';
Xerr(ir) = grossOutput(ir) -
sum(ir2,WIOT(ir,ir2)) - finalDemand(ir);
abort$sum(ir$(abs(Xerr(ir))>1e-5),1) "Check Xerr";

*----------------------------------------------------------------
* form Leontief inverse
*----------------------------------------------------------------

parameter L(i,r,i,r) 'Leontief inverse';

$batinclude leontief.inc A L ir

*----------------------------------------------------------------
* some checks
*----------------------------------------------------------------

* check  (I-A) * inv(I-A) = I
* do a small subset of countries (a complete check would take a long time).
set rsub(r) /AUS,AUT/;
parameter InvErr(i,r,i,r) 'error in identity (I-A)*L=I';
InvErr(ir(i,r),ir2(i2,r2))$(rsub(r)
and rsub(r2)) =
   
sum(ir3, (Ident(ir,ir3)-A(ir,ir3))*L(ir3,ir2)) - Ident(ir,ir2);
abort$sum((ir,ir2)$(abs(InvErr(ir,ir2))>1e-5),1) "Check InvErr";

* check  x = inv(I-A)*f = L*f
parameter xInvErr(i,r) 'error in x = inv(I-A)*f';
xInvErr(ir) =
sum(ir2, L(ir,ir2)*finalDemand(ir2)) - GrossOutput(ir);
abort$sum(ir$(abs(XInvErr(ir))>1e-5),1) "Check XInvErr";



Here the two-dimensional set ir(i,r) is used to drive the inversion code. It is also used in writing the matrix expressions (to make the matrices look like 2d structures with rows and columns, e.g. A(ir,ir2)).
 
The batinclude file with the Python code for the Leontief inverse is:


$ontext

 
leontief.inc

 
Calculate the leontief inverse.

 
Arguments:
  
%1 : input matrix: square matrix with technological coefficients
  
%2 : output matrix: Leontief matrix
  
%3 : sets defining A and L
       
can have 1 or multiple dimensions

$offtext

$offlisting

embeddedCode Python:

gams.printLog("\n### leontief.inc")
import numpy as np
import numpy.linalg as la
import scipy.sparse as sp
import time

TRACE = True

# arguments
sA =
"%1"
sL =
"%2"
sI =
"%3"

t0 = time.time()

#
#  load set I
#
if TRACE: gams.printLog(f".. load set {sI}")
gmsI = list(gams.get(sI,keyType=KeyType.INT))
if TRACE: gams.printLog(f".... {gmsI[0:5]} (max 5 shown)")

#
#  get size and dimension
#
n = len(gmsI)
if TRACE: gams.printLog(f".. size of matrices=({n}x{n})")
assert n>0
if isinstance(gmsI[0],int):
   dim =
1
else:
   dim = len(gmsI[
0])
if TRACE: gams.printLog(f".. dim={dim} (dimension of row and column indices)")

#
#  load matrix
#
gams.printLog(f
".. load parameter {sA}")
gmsA = list(gams.get(sA,keyType=KeyType.INT,keyFormat=KeyFormat.FLAT))
if TRACE: gams.printLog(f".... {gmsA[0:3]} (max 3 shown)")

t1 = time.time()-t0
gams.printLog(f
".. loaded gams data {sI},{sA} ({t1:.1f}s)")


#
#  convert gmsA into something we can actually use
#

# set up dict for indices that maps GAMS numbers to Python indices
imap = dict(zip(gmsI,range(len(gmsI))))

# set up matrix.
gmsA2 = list(zip(*gmsA))
# unzip
try:
 
if dim==1:
    irow = [imap[k]
for k in gmsA2[0]]
    jcol = [imap[k]
for k in gmsA2[1]]
    aij  =  list(a2[
2])
 
else:
    irow = [imap[kk]
for kk in zip(*[gmsA2[k] for k in range(dim)])]
    jcol = [imap[kk]
for kk in zip(*[gmsA2[k] for k in range(dim,2*dim)])]
    aij  = list(gmsA2[
2*dim])
except:
   gams.printLog(f
">>>> Some indices are not in {sI}. Check the set {sI}.")
  
raise ValueError()

# create a dense numpy matrix I-A
npIA = np.eye(n) - sp.coo_matrix((aij,(irow,jcol)),shape=(n,n)).toarray()

t2 = time.time()-t0-t1
gams.printLog(f
".. Created matrix I-{sA} ({t2:.1f}s)")

#
# invert I-A
#
try:
  invIA = la.inv(npIA)
except la.LinAlgError:
  gams.printLog(
".. Singular matrix. Returning empty matrix.")
  invIA = np.zeros_like(npIA)

t3 = time.time()-t0-t1-t2
gams.printLog(f
".. Inverted matrix I-{sA} ({t3:.1f}s)")

#
# return inv(I-A)
#
if dim==1:
   listL = [((gmsI[i1],gmsI[j1]),invIA[i1,j1])
for i1 in range(n) for j1 in range(n)]
else:
   listL = [(tuple(gmsI[i1]+gmsI[j1]),invIA[i1,j1])
for i1 in range(n) for j1 in range(n)]

gams.set(sL,listL)

t4 = time.time()-t0-t1-t2-t3
gams.printLog(f
".. Created matrix {sL} ({t4:.1f}s)")


t = time.time()-t0
gams.printLog(f
".. elapsed {t:.1f}s")

endEmbeddedCode
%2



Again, the Python code is very slow so we spend a lot of time preparing the matrix for inversion. In this case, the inversion takes a couple of seconds, but the overhead is several minutes! 

Note that this last batinclude file uses a call to scipy.sparse.coo.matrix. So you need to install scipy into the GAMS Python environment. As it does not come with pip this is quite an adventure.


Conclusions


  • The embedded Python API is not well suited for algorithms that require a substantial amount of data to be exchanged with GAMS. It uses a low-level approach, something which is not compatible with Python. Python is just not designed to do large-scale low-level stuff. This is a design problem in the API: the wrong abstraction level was chosen. Data should be exchanged in higher-level data structures such as numpy arrays or Pandas dataframes using high-performance kernels. This will lead to performance gains of at least a factor of 10.
  • As a result, an implementation using the Python embedded API suffers from large overhead and requires patience from the user.   

Notes


  • I have had some good experiences with Cython and Numba to speed up Python code. But I don't think those tools are applicable here.
  • GAMS 36.0 comes with a Python-based matrix inversion. It is similar to my version and can be invoked as:
     
       $libInclude linalg -e invert i A B

    Unfortunately, it seems to be about 50% slower than my version:

    46.125   0.068GB        33 EmbeddedCode   (my version)
    73.469   0.068GB       107 EmbeddedCode   (GAMS linalg version)

    (The first column is the time in seconds). I have not analyzed this carefully but I think I do better because I don't have if statements in a tight loop. Note that the actual matrix inversion takes less than a second.
  • When you want to use this for practical input-output models, you probably want to save the inverse matrix in a GDX file or a restart file so it is readily available.
  • We could use an LP solver to invert a matrix [5]. That is more of a curiosity. 
  • There are good arguments to never invert a matrix [6]. However, for reasons of convenience and familiarity, it is easier to use an explicit inverse here. This is just how users work and how they are educated.

References

 
  1. Input-output model, https://en.wikipedia.org/wiki/Input%E2%80%93output_model
  2. Input-Output Tables for Japan, https://www.soumu.go.jp/english/dgpp_ss/data/io/index.htm
  3. Ronald E. Miller and Peter D. Blair, Input-Output Analysis, Foundations and Extensions, Cambridge University Press, 2nd edition, 2009.
  4. World Input-Output Database, http://www.wiod.org/database/wiots16
  5. Inverting a matrix by LP, http://yetanothermathprogrammingconsultant.blogspot.com/2021/04/inverting-matrix-by-lp.html
  6. John D. Cook, Don't invert that matrixhttps://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/

3 comments:

  1. Erwin, Thanks for putting this together. As usual, your post inspires us (GAMS) to improve our software. I have a few comments. I am only refering to the first part of the post (inverting a matrix).

    As a reference point I ran your code with GAMS 36.2.0 on my off-the-shelf machine with Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz and OS Windows 10 Pro 21H1:

    --- .. Copied data from i,A (3.0s)
    --- .. Inverted A (0.1s)
    --- .. Created GAMS sparse matrix B (3.1s)
    --- .. Total elapsed 6.2s

    While the times are about 7 times faster than what you report, the ratio is unchanged and not good. The gams.get() method is only one way to get data from GAMS to embedded Python code. One can also use the Python OO API which works nicely together with gams2numpy which has methods to move data in bulk between GAMS and Python. Basically you do Gams2Numpy().gmdReadSymbolRaw(gams.db,"A") this gives you a numpy array with the data from A in one swoop (which can then turned cheaply into a pandas dataframe). There are also methods to get the index as strings and obviously also ways to write back a numpy array to GAMS. This has not been well documented but with the upcoming GAMS 37 there will be more about gams2numpy and a new API called Transfer that unifies bulk data exchange across some systems. With GAMS 37 we hope to include Transfer for Python (based on gams2numpy) and Matlab/Octave. R, Julia, and others will follow. This still leaves us with the issue of remapping the GAMS indexes of i (that can have holes, does not have to start at 1, etc. I actually added a dummy set to make the GAMS indexes "more interesting") to the Python indexes of 0..card(i)-1. In your example (and also in our linalg libinclude) a map between these two is established and applied in both directions (for reading A and writing B) using some Python loop. My colleague Adam Christensen (who works on the Python port of Transfer) uses pandas category type for this mapping, no (user) loops required! There is an assumption (we check for it with an assert) that the matrix does not contain an all zero row or column, which is okay for an invert routine. So with the bulk data transfer and the categories the times come down even further:

    --- .. Copied data from i,A (0.5s)
    --- .. Inverted A (0.1s)
    --- .. Created GAMS sparse matrix B (1.3s)
    --- .. Total elapsed 1.9s

    So we can reduce the total time by a factor of 3. Still a pretty big ratio. Interesting enough with a dedicated tool like your invert.exe and GDX file communication this is still significantly faster (total time ~0.5 secs!) than all the in-memory communication of data and embedded Python code. Nevertheless, embedded Python code in GAMS gives a lot of flexibility (and access to the Python universe) and less of a dependence on GAMS to provide tools to do a specific task. The category stuff is pretty interesting and perhaps we can do part of this remapping already in gams2numpy (which uses the numpy C api and executes fast). The entire invert example redone with gams2numpy and pandas catagories is accessible here: https://cloud.gams.com/s/FYZPsXAcPfn3pfa

    ReplyDelete
    Replies
    1. Hi Michael. That looks very promising. Your link seems to be just a reformatted version of my code (no gams2numpy/pandas).

      Delete
    2. Sorry, I messed up by uploading the "reference" model. I have updated the file now on cloud.gams.com and the link https://cloud.gams.com/s/FYZPsXAcPfn3pfa should have now the version with gams2numpy/pandas.

      Delete