Sunday, December 18, 2016

MonetDB and R


MonetDB (1) is a so-called column-oriented database or column-store (2). It is server-based, SQL-compliant and open source. The column orientation can lead to better performance for certain types of tasks, especially OLAP and ETL (i.e. analytical work). Traditional row-wise databases are said to be more appropriate for OLTP workloads.
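
To make the row versus column distinction concrete, here is a minimal toy sketch in Python. It is not how MonetDB is implemented; the table, the column names and the helper functions are all made up for illustration. It only shows why an aggregate over one column touches less data in a column layout, while inserting a single record is simpler in a row layout.

```python
# Toy illustration only (hypothetical data and helper names, not MonetDB internals).

# Row-wise layout: one record per row, all fields stored together.
rows = [
    {"id": 1, "region": "east", "sales": 10.0},
    {"id": 2, "region": "west", "sales": 20.0},
    {"id": 3, "region": "east", "sales": 30.0},
]

# Column-wise layout: one list per attribute, built from the same data.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_sales_rowwise(table):
    # Analytical query on a row store: every full record is visited,
    # although only the "sales" field is needed.
    return sum(record["sales"] for record in table)


def total_sales_columnwise(table):
    # Same query on a column store: only the "sales" list is read.
    return sum(table["sales"])


def insert_rowwise(table, record):
    # OLTP-style write on a row store: a single append.
    table.append(record)


def insert_columnwise(table, record):
    # The same write on a column store: one append per column.
    for name, value in record.items():
        table[name].append(value)
```

Both total functions return the same number; the difference is in how much data they have to walk through, which is what matters for OLAP-style scans over wide tables.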

MonetDBLite

There is a CRAN package that lets R talk to a MonetDB server (MonetDB.R). A separate package, MonetDBLite, contains an in-process version of MonetDB. This means MonetDBLite is more or less an alternative to RSQLite (4). The following picture, taken from (4), compares the performance of MonetDBLite:

Basically, the closer to the lower-left corner, the better.

Reference (3) gives some timings comparing MonetDBLite to SQLite, e.g. for converting a (large) table to a data frame:

This task moves a lot of data from the database to R, and the server-based MonetDB.R does not finish for the larger tables (the traffic has to go through a TCP/IP socket). The embedded database, however, is very fast compared to SQLite.

This looks like an interesting database for R-related work.

References
  1. MonetDB The column-store pioneer, https://www.monetdb.org/Home
  2. Daniel Abadi, Distinguishing Two Major Types of Column-Stores, March 2010, http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
  3. https://www.monetdb.org/blog/monetdblite-r
  4. Anthony Damico, MonetDBLite because fast, June 2016, https://www.r-bloggers.com/monetdblite-because-fast/
  5. Using MonetDB[Lite] with real-world CSV files, https://www.r-bloggers.com/using-monetdblite-with-real-world-csv-files/