I am a full-time consultant and provide services related to the design, implementation and deployment of mathematical programming, optimization and data-science applications. I also teach courses and workshops. Usually I cannot blog about projects I am doing, but there are many technical notes I'd like to share. Not in the least so I have an easy way to search and find them again myself. You can reach me at firstname.lastname@example.org.
Wednesday, May 23, 2018
Excel Power Query CSV read somewhat slow
Reading a large CSV file into Excel's data model is somewhat slow compared to R: Excel/Power Query takes 100 seconds vs R only 25 seconds on the same CSV file (with 1.8 million records).
Reading CSV into Excel/Power Query data model
I used some VBA code for the timing. The only thing we do is to load data into the data model (i.e. there is no Power Pivot table to create).
R does this quite a bit faster:
Reading the same CSV into R
I expected this to be closer. I think this operation should be IO bound instead of CPU bound, something Microsoft should know how to do super fast.