In (1) I provided a GAMS/VBS script to generate an Excel spreadsheet containing a pivot table that used a CSV file as an external data source. Essentially the pivot table provides a view of the data in the CSV file. Here I show a script that stores the data inside the Excel data model: no external CSV file is needed after the data has been loaded into the data model.
The steps are as follows:
- Load data into GAMS.
- Create a 4-dimensional cube parameter suited for display as a pivot table. The dimensions are commodity, country, year, and attribute.
- Export the cube as GDX file.
- Convert the GDX file to a CSV file.
- Run a VBScript script that talks to Excel and performs the following steps (a sketch of such a driver is shown after this list):
  - Load the data into the data model.
  - Create a pivot table for this data.
  - Create an initial layout that makes sense for a user of the data (as opposed to an empty pivot table).
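The post does not list the VBScript itself, so below is a minimal sketch of what such a driver could look like. This is not the actual script from (1): it assumes Excel 2016 or later (for Workbook.Queries), a file cube.csv in the current directory, a data model table that inherits the query name cube, and columns spelled exactly commodity, country, year, attribute, Val. Excel enumeration constants are written as numbers because VBScript cannot see the Excel type library.

' Sketch only, not the script from (1): build pivotcube.xlsx with cube.csv
' loaded into the workbook data model and a pivot table on top of it.
Option Explicit
Dim fso, xl, wb, ws, csvPath, mCode, conn, pc, pt

Set fso = CreateObject("Scripting.FileSystemObject")
csvPath = fso.GetAbsolutePathName("cube.csv")

Set xl = CreateObject("Excel.Application")
xl.Visible = False
xl.DisplayAlerts = False
Set wb = xl.Workbooks.Add
Set ws = wb.Worksheets(1)

' Step 1: register the M query that reads the CSV (the same M steps are
' shown in the Data model section below).
mCode = "let" & vbCrLf & _
        "  Source = Csv.Document(File.Contents(""" & csvPath & """),[Delimiter="",""])," & vbCrLf & _
        "  #""Headers"" = Table.PromoteHeaders(Source)," & vbCrLf & _
        "  #""Type"" = Table.TransformColumnTypes(#""Headers"",{{""Val"", Number.Type}})" & vbCrLf & _
        "in" & vbCrLf & _
        "  #""Type"""
wb.Queries.Add "cube", mCode

' Step 2: load the query result into the workbook data model
' (CreateModelConnection = True; 6 = xlCmdTableCollection).
Set conn = wb.Connections.Add2("Query - cube", "cube data", _
    "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=cube", _
    """cube""", 6, True, False)
conn.Refresh

' Step 3: create a pivot table on the data model (2 = xlExternal).
Set pc = wb.PivotCaches.Create(2, conn)
Set pt = pc.CreatePivotTable(ws.Range("A3"), "pivotcube")

' Step 4: an initial layout (field and table names are assumptions):
' 1 = xlRowField, 2 = xlColumnField, 3 = xlPageField, -4157 = xlSum.
pt.CubeFields("[cube].[commodity]").Orientation = 1
pt.CubeFields("[cube].[country]").Orientation = 1
pt.CubeFields("[cube].[year]").Orientation = 2
pt.CubeFields("[cube].[attribute]").Orientation = 3
pt.AddDataField pt.CubeFields.GetMeasure("[cube].[Val]", -4157, "Sum of Val"), "Sum of Val"

' 51 = xlOpenXMLWorkbook (.xlsx)
wb.SaveAs fso.GetAbsolutePathName("pivotcube.xlsx"), 51
wb.Close False
xl.Quit

Such a script would be run from the command line with cscript (the file name makepivot.vbs is hypothetical): cscript //nologo makepivot.vbs.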
The result will look like:
We see three regions. On the left we see the actual pivot table:
The rightmost panel allows changing the layout of the pivot table:
Note that our 4-dimensional cube parameter has become a 5-column table: the values are a separate column.
The smaller panel in the middle shows the queries in the data model:
Data model
The step that loads the CSV data into Excel's data model is actually written in what is called the M language:
    let
        Source = Csv.Document(File.Contents("cube.csv"),[Delimiter=","]),
        #"Headers" = Table.PromoteHeaders(Source),
        #"Type" = Table.TransformColumnTypes(#"Headers",{{"Val", Number.Type}})
    in
        #"Type"
Interestingly, we can follow these steps after the fact in the Excel Query Editor (they can also be read back programmatically, as sketched at the end of this section). After the first step, Source, we see that the raw CSV data has been imported:
The second step promotes the headers:
and the last step changes the type of the value column to numeric:
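The stored M formula and the model connection can also be inspected programmatically, which is handy for checking what ended up in the workbook. A small sketch, again an assumption rather than code from the post, that opens the workbook produced earlier and echoes its queries and connections:

' Inspect the queries and connections of an existing workbook.
Option Explicit
Dim fso, xl, wb, q, c
Set fso = CreateObject("Scripting.FileSystemObject")
Set xl = CreateObject("Excel.Application")
xl.Visible = False
Set wb = xl.Workbooks.Open(fso.GetAbsolutePathName("pivotcube.xlsx"))
For Each q In wb.Queries
    WScript.Echo "Query " & q.Name & ":" & vbCrLf & q.Formula
Next
For Each c In wb.Connections
    WScript.Echo "Connection " & c.Name & ", in data model: " & c.InModel
Next
wb.Close False
xl.Quit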
File Sizes
The method in (1) requires both an .xlsx file and a .csv file to be present. For this example, we end up with:
02/18/2017  12:49 PM         6,397,317 pivotcube.xlsx
02/18/2017  12:48 PM        62,767,308 cube.csv
The method shown here produces a self-contained .xlsx file. We have:
02/18/2017  12:52 PM         2,958,573 pivotcube.xlsx
I am not sure why this version is even smaller than the pivot table with external CSV data, but apparently the internal data model can be compressed very efficiently. Some of this may be due to the internal VertiPaq compression engine, which allows for very efficient columnar storage of the data in the data model (3).
References
1. http://yetanothermathprogrammingconsultant.blogspot.com/2016/04/export-large-result-sets-to-pivot-table.html contains a script to generate a pivot table with a CSV file as an external data source.
2. http://yetanothermathprogrammingconsultant.blogspot.com/2016/07/excel-pivot-table-and-large-data-sets.html shows how to load CSV data into the data model and create a pivot table by hand.
3. Alberto Ferrari and Marco Russo, The VertiPaq Engine in DAX, https://www.microsoftpressstore.com/articles/article.aspx?p=2449192&seqNum=3 (a chapter from The Definitive Guide to DAX: Business Intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI, Microsoft Press, 2015).