How to refresh Excel files faster in Power BI
I was recently getting an error when refreshing an Excel file that was stored in SharePoint online.
This got me thinking what the issue is and is there a way to improve it and YES there is as detailed below.
When I investigated this further, I found that my 17kb Excel file was using at least 616kb of data when being refreshed in Power BI Desktop. Whilst this is a relatively small amount this is 36x larger than the file size. This led me to believe that it is possibly being read multiple times, but some other things might be going on!
The first thing that I found that I was using the SharePoint.Files connector, and what this does is it first connects to the SharePoint folder and displays all the files.
In my case I have over 27,000 files!
Only in the next step do I then filter for the one file.
To me I thought this is a lot of files to load and to then filter.
What I then did was thanks to working together with my fellow worker (Ajay) I then used the Web.Contents connector.
I then used the Power Query Diagnostics tools to see what the effect would be.
I used the Diagnostics which can be found under Tools in the Power Query Editor
I ran the diagnostics for the SharePoint.Files connector and for the Web.Contents connector. Next, I loaded this data into Power BI Desktop to see how long each one took.
As you can see below the SharePoint.Files took 46 seconds to load the data and had 196 rows.
Comparing this to the Web.Contents this took 1 second to load and 22 rows.
That is a massive improvement.
I thought this is awesome using the Power Query Diagnostics but what happens when I refresh this in the Power BI Service, would I see similar results?
Below is the reference for the SharePoint.Files refresh in the Power BI Service where it took roughly 45 seconds to complete!
And here is the Web.Contents refresh times which took about 2 seconds to complete.
This matched the Power Query Diagnostics I had seen, which is great to see.
And this confirmed that changing to use the Web.Contents improves the refresh performance by a massive margin.
Summary
In this blog post I have shown how I increased the refresh performance for my Excel file, which made the entire process faster.
Question and comments are most welcome!
Hi,
That is certainly some interesting time differences.
I used to use Web.Contents but on some projects found it to be a pain on the admin side when you had to deal with credentials in the Power BI Service, e.g. the Data source credentials needed to be managed for every Web.Contents connection, whereas a connection to a SharePoint site only has one instance to manage (assuming one site).
This is when I started to play with Table.Buffer to improve the SharePoint connection performance. This can make significant improvements to the performance although it seems to be more art (guesswork) than science to get this right. Trying to add the Table.Buffer in different stages/steps to gain performance can be very time consuming.
Have you had any experience using Table.Buffer to improve SharePoint connection performance?
Thanks for the comment Kim.
I have also had different experiences with Table.Buffer and where and when it works. And sometimes I have found it to make things run even slower.
As you mentioned it takes a bit of testing to see where it could potentially be faster.
That is why this method works for me because it is faster and simple.
Wonder if the same difference is seen in SharePoint.Folder as well as SharePoint.Files
The only way would be to test it out!
Wonderful information! Thank you for sharing
Thanks Ajay
I’ve found that using SharePoint.Contents is significantly faster and ensures the credentials are all maintained within the organisation. Is there a way for you to add it to your test cases?
Hi Shaz,
That is really interesting I will see when I can have a look (No promises that it will be in the next week)
Any comparison with sharepoint.content where you navigate through folders? (already faster than sharepoint.files)
Hi Romain,
Thanks for the comment, and people have mentioned SharePoint.Contents which is also very fast. I have not had time yet to see if this will be faster.
Hi Gilbert,
can you clarify to me how did you measured this? –> “When I investigated this further, I found that my 17kb Excel file was using at least 616kb of data when being refreshed in Power BI Desktop”
Hi Riccardo,
Thanks for the comment!
What I meant by this is I knew that my file size was 17KB, but when the data was being loaded in Power Query I could see on the bottom left hand side that it said loading 616KB which indicated to me it was reading the file multiple times.
Now I got you! thank you!