Wednesday, May 23, 2018

Excel Power Query CSV read somewhat slow

Reading a large CSV file into Excel's data model is somewhat slow compared to R: Excel/Power Query takes 100 seconds vs R only 25 seconds on the same CSV file (with 1.8 million records).

Reading CSV into Excel/Power Query data model
I used some VBA code for the timing. The only thing we do is to load data into the data model (i.e. there is no Power Pivot table to create).

R does this quite a bit faster:

Reading the same CSV into R
I expected this to be closer. I think this operation should be IO bound instead of CPU bound, something Microsoft should know how to do super fast.