Wednesday, April 13, 2016

CSV file too large to view or import with Excel

Excel has a hard limit of 1 million rows. This limit also holds for the 64 bit version of Excel: it is more related to the way row numbers are stored than running out of memory. When you try to open a CSV file that is larger, Excel will truncate the data:

image

One way of viewing such a file is to keep the data as a CSV file, and use a Pivot Table with an external data source. Usually Pivot Tables use data stored in a different sheet. But it is also possible to have the raw data stored in an external database. A CSV is file is just a database table when we use ODBC with the Text Driver. This way we can pivot the table (e.g. years as columns) and select slice (e.g. we likely want to see one commodity and one attribute type at the time).

image 

With the script we make sure the initial layout make sense, i.e. it has something useful to show (Coffee production in Antarctica is likely to be non-existent so that would be bad example to start with).

image

In the sheet with the pivot table we have at the top two drop down menus where we can specify which attribute and commodity to select.

image

I use this technique quite often to view large data sets.