How to configure Incremental Refresh with Power Query in Power BI
Following on from my successful blog post How you can incrementally refresh any Power BI data source (This example is a CSV File), I found a way where I can just use dates created in Power Query to get data refreshing incrementally.
Full credit goes to Rafael Mendonç who actually figured this out. All that I have done is to translate what Rafael Mendonça did in his PBIX and put it into steps that you can follow along with.
https://www.rafaelmendonca.com/2020/06/incremental-powerbi-csv-api-excel-odbc.html
In this blog post I am going to demonstrate how to get this working with what I hope is very easy to follow. As well as you can download the PBIX at the end of the blog post, which you can use as your template going forward.
In my example I am going to have a start date of 01 June 2020 (which is when my data will start from)
I am going to initially configure the RangeStart and RangeEnd Parameters to be 01 June 2020 to 03 June 2020
Once I have completed the configurations and successfully refreshed the dataset, I will have dates and partitions for the past 20 days
Configuring the Parameters
- I go into Power Query Editor
-
I configure StartDate Parameter to be 2020-06-01
- NOTE: Change this start date in your data to be when your data begins.
-
I then configure my RangeStart to be 2020-06-01
- A quick note even though I did not put in the time Power Query is smart enough to put those in for me.
-
I then configure my RangeEnd to be 2020-06-03
- A quick note even though I did not put in the time Power Query is smart enough to put those in for me.
- I verified that my parameters were as expected
Configuration of my Data table
Following on from above is where I have my data table, which is where I configure the RangeStart and RangeEnd Parameters
-
As you can see below, I configured my column to use the RangeStart and RangeEnd Parameters
- And as shown above I can see the 2 dates I expect to be based on my parameters.
-
As with my previous blog post, I could then add in additional columns in this table which will give me the required date column which I can join to any existing dataset.
- This will then allow my dataset to be refreshed incrementally.
-
One additional step I had to configure to get the incremental refresh to work is I added another table which has an actual data source.
- This table gets an Exchange Rate from a Website
- I had to create this table. Without any actual data sources in my dataset Power BI actually has nothing to refresh.
- I then loaded the dataset.
-
This is what it looked like before the incremental refreshing
Testing the Incremental Refreshing
The steps below are where I configured and tested the incremental refreshing to make sure it works as expected.
-
I configured my data table with the following incremental refreshing policy for the post 20 days and to refresh the past 2 days
- I then uploaded my dataset to the Power BI Service (Again using Power BI Premium so I could double check the refreshing and partitions!)
-
Next, I completed the first dataset refresh which should create the partitions.
- I verified this using SSMS and I could see the last 20 days worth of partitions
- Now was the final step to see it it will only refresh the last 2 days worth of partitions. I went and refreshed the dataset again.
-
I refreshed the partitions in SSMS, and I could now see the last 2 days of partitions were refreshed.
-
Finally, I could see the table has been updated and incrementally refreshed.
Summary
I must admit for me personally this is even better than my previous blog post, because there is no need to create, maintain and look after an Azure SQL Server database and table.
The pattern to achieve this with your dataset can be summarized as the following:
- Copy the parameters and data table.
- Edit the StartDate parameter to match the start date of your data.
- Create a column in the data table which has a date or date time which you can use with your data.
- Join your dataset to the data table using a function or a merge
- Load the dataset
- Configure the Incremental refreshing.
- Upload your PBIX, refresh once, then schedule the refresh and you now have your incremental refresh working.
I hope that this has been useful and if you have got any questions please leave them in the comments section below.
Here is a link to the PBIX: Incremental Refresh with Power Query.pbix
This is a good summary – the official documentation of it can be confusing to follow.
Note that your parameters should be the same format as the field you will be using – either date or datetime.
If you’re using a different data source then you also might have to reformat the date, example teradata likes either xyyMMdd or ‘yyyy-mm-dd’ so you’ll likely need to do DateTime.From(StartRange, “1yyMMdd”), or create a function to convert it for you.
It’s a similar process to set up incremental refresh in dataflows in the PowerBI service too.
Hi Ian,
Thanks for the comments and you are totally right.
I decided to not go into too much detail but rather show the pattern. Which will then allow others to plug their data into the dataset.
[…] Gilbert Quevauvilliers has a follow-up from a post: […]
Great work Gilbert!
Thanks Doug
incremental refresh only works with power bi premium? or also works with power bi pro?
Hi there
Incremental refreshing works with Pro and Premium!
Gilbert, thanks for the article. In SAP BW, the date field it’s text type “dd.MM.yyyy”, and actually query folding it’s supported. How can I make inc. refresh work in this scenario? I’m not able to filter >= or < in a text field, and transforming the field into "datetime" type breaks query folding. Any help will be greatly appreciated. Thanks!
Hi Dan,
Thanks for the comment.
And what I would suggest doing is to change the data type for your column to DateTime in the Power Query Editor. As long as the previous steps fold, which they will after that the data will only get back the extra days based on the incremental refreshing.
Thanks for the example. Knowing how to test this is a great help. I’m uncertain of one part, though.
“Next, I completed the first dataset refresh which should create the partitions.
I verified this using SSMS and I could see the last 20 days worth of partitions”
Where did you get the AS connection string from that Power BI is using?
Hi Andrew,
Thanks for the comment.
This can only be done if you are using Power BI Premium or Power BI Premium per user and then you can connect via SSMS?
This is great – thanks for sharing! I hadn’t come across using SMSS to verify that the partitions have been created, and are updating as per the configuration – That’s super cool! Did you look into maybe automating some reporting from this? It would be great to have this as part of the DevOps reporting…. Thoughts?
Hi Robbie,
Thanks for the kind comment!
I have not looked to see how to automate this, but it could be done via SSIS or Azure Data Factory and then save the results into Azure Blob Storage?