Tuesday, February 4, 2014

GAMS–>R: gdxrrw vs gdx2sqlite

There are at least two ways to get data from GAMS into R. Use the tool gdxrrw (http://www.gams.com/presentations/informs2012_gdxrrw.pdf) which allows reading and writing GDX files from within R or use a database representation in between. Here we use gdx2sqlite and the SQLite database (http://yetanothermathprogrammingconsultant.blogspot.com/2013/10/gdx2sqlite.html). Here are some differences:

gdxrrw gdx2sqlite
GAMS code GAMS code
sets
  j
/j1*j3/
  k
/k1*k2/
;
parameter a(j,k);
a(j,k) = uniform(0,1);


execute_unload "data.gdx";
sets
  j
/j1*j3/
  k
/k1*k2/
;
parameter a(j,k);
a(j,k) = uniform(0,1);


execute_unload "data.gdx"
;
execute "gdx2sqlite -i data.gdx -o data.db";
R Code R Code

if (!require(reshape2)) {
   install.packages("reshape2", repos="
http://cran.r-project.org")
   library(reshape2)
}

if (!require(gdxrrw)) {
   download.file("
http://support.gams.com/lib/exe/fetch.php?media=gdxrrw:gdxrrw_0.4.0.zip","gdxrrw_0.4.0.zip")
   install.packages("gdxrrw_0.4.0.zip",repos=NULL)
   library(gdxrrw)
}

igdx("")

j<-rgdx.set("data.gdx","j")
j

a<-rgdx.param("data.gdx","a")
a

a$j

if (!require(RSQLite)) {
   install.packages("RSQLite", repos="
http://cran.r-project.org")
   library(RSQLite)
}

sqlite<-dbDriver("SQLite")
db <- dbConnect(sqlite,"data.db")

j<-dbGetQuery(db,"select * from j")
j

a<-dbGetQuery(db,"select * from a")
a

a$j

Results Results

> j
   i
1 j1
2 j2
3 j3
> a
   i  j     value
1 j1 k1 0.1717471
2 j1 k2 0.8432667
3 j2 k1 0.5503754
4 j2 k2 0.3011379
5 j3 k1 0.2922121
6 j3 k2 0.2240529
> a$j
[1] k1 k2 k1 k2 k1 k2
Levels: k1 k2

> j
   j
1 j1
2 j2
3 j3
> a
   j  k     value
1 j1 k1 0.1717471
2 j1 k2 0.8432667
3 j2 k1 0.5503754
4 j2 k2 0.3011379
5 j3 k1 0.2922121
6 j3 k2 0.2240529
> a$j
[1] "j1" "j1" "j2" "j2" "j3" "j3"

The database does a much better job of getting the correct column names into R. It is possible to retrieve the correct column names with gdxrrw by inspecting the attributes of the data frame. So we could have used the following assignment to repair the column names:

names(a)<-append(attributes(a)$domains,"value")

Another difference is that columns related to GAMS indices have a different type (factor v.s. character vector). The database interface always returns character columns as such, without translation into factors (even if getOption("stringsAsFactors") is true, which is actually the default).

R allows downloading and installing packages without specifying the version number: you get automatically the latest one. Unfortunately for gdxrrw you need to specify a complete version.

Update: looks like gdxrrw 0.4.0’s behavior has changed. I no longer see i,j,k,… as default column names but i1,i2,i3,…