I was recently getting an error when refreshing an Excel file that was stored in SharePoint online.

This got me thinking what the issue is and is there a way to improve it and YES there is as detailed below.

When I investigated this further, I found that my 17kb Excel file was using at least 616kb of data when being refreshed in Power BI Desktop. Whilst this is a relatively small amount this is 36x larger than the file size. This led me to believe that it is possibly being read multiple times, but some other things might be going on!

The first thing that I found that I was using the SharePoint.Files connector, and what this does is it first connects to the SharePoint folder and displays all the files.

In my case I have over 27,000 files!

Only in the next step do I then filter for the one file.

To me I thought this is a lot of files to load and to then filter.

What I then did was thanks to working together with my fellow worker (Ajay) I then used the Web.Contents connector.

I then used the Power Query Diagnostics tools to see what the effect would be.

I used the Diagnostics which can be found under Tools in the Power Query Editor

I ran the diagnostics for the SharePoint.Files connector and for the Web.Contents connector. Next, I loaded this data into Power BI Desktop to see how long each one took.

As you can see below the SharePoint.Files took 46 seconds to load the data and had 196 rows.

Comparing this to the Web.Contents this took 1 second to load and 22 rows.

That is a massive improvement.


I thought this is awesome using the Power Query Diagnostics but what happens when I refresh this in the Power BI Service, would I see similar results?

Below is the reference for the SharePoint.Files refresh in the Power BI Service where it took roughly 45 seconds to complete!

And here is the Web.Contents refresh times which took about 2 seconds to complete.

This matched the Power Query Diagnostics I had seen, which is great to see.

And this confirmed that changing to use the Web.Contents improves the refresh performance by a massive margin.

Summary

In this blog post I have shown how I increased the refresh performance for my Excel file, which made the entire process faster.

Question and comments are most welcome!