How to enable the Power BI On-Premise Data Gateway to Stream data immediately
Below is a quick blog post on how to stream the data immediately when using the On-Premise Data Gateway.
Currently, as far as I understand it, the On-Premise Data Gateway waits and buffers some data before sending it through to the Power BI Service. Once the setting below is changed in the On-Premise Data Gateway, it starts streaming the data almost immediately.
I am fortunate enough to be really good mates with Phil Seamark, who is part of the Power BI CAT team, and he gave me a little nugget of gold that I would like to share with you.
To enable this, I did the following:
- I went to the folder where I had my On-Premise Data Gateway installed.
  - The default location is here: C:\Program Files\On-premises data gateway
- I always recommend making a backup copy of anything before making a change, so I made a copy of the config file below first.
  Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config
  - I know the name I gave the copy is not a great one, but I know I have a working copy from before I made any change.
- I then opened the file Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config
- I then went to the section called StreamBeforeRequestCompletes.
  - In its default state the value was set to False.
- I then made the change and set the value to True.
  - I then saved and closed the file.
- My final step was to restart the On-Premise Data Gateway.
I then tested refreshing the data, and it started loading almost instantly (I did notice that this depends on the underlying data source).
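For anyone who would rather script the backup and the edit than do them by hand, the steps above can be sketched roughly as follows. This is only a minimal sketch and not something from Phil or the gateway documentation: the function name `enable_streaming` and the `.bak` backup suffix are my own choices, and it assumes the setting is stored as a `<setting name="StreamBeforeRequestCompletes">` element with a child `<value>` element (the usual .NET application settings layout). Please check your own config file before relying on it.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

SETTING_NAME = "StreamBeforeRequestCompletes"


def enable_streaming(config_path):
    """Back up the config file, then set StreamBeforeRequestCompletes to True.

    Returns True if the setting was found and updated, otherwise False.
    """
    config_path = Path(config_path)

    # Keep a working copy next to the original before touching anything.
    # The ".bak" suffix is an arbitrary choice for this sketch.
    backup_path = config_path.with_name(config_path.name + ".bak")
    shutil.copy2(config_path, backup_path)

    # insert_comments=True (Python 3.8+) keeps XML comments inside the
    # root element instead of silently dropping them on save.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    tree = ET.parse(config_path, parser=parser)

    changed = False
    # Assumed layout: <setting name="..."><value>False</value></setting>
    for setting in tree.getroot().iter("setting"):
        if setting.get("name") == SETTING_NAME:
            value = setting.find("value")
            if value is not None:
                value.text = "True"
                changed = True

    # Only rewrite the file if the setting was actually found.
    if changed:
        tree.write(config_path, encoding="utf-8", xml_declaration=True)
    return changed
```

You would call it with the full path to Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config in your gateway folder, most likely from an elevated prompt since the file sits under Program Files. The script may change whitespace and quoting in the file when it saves it, and the gateway still has to be restarted afterwards for the change to be picked up.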
Conclusion
This quick blog post has shown how I have enabled the On-Premise Data Gateway to start streaming data almost immediately.
As always, any comments or suggestions are most welcome.
Thanks for reading!
Would this explain why schedule refresh takes longer than on-demand?
Hi Samuel, thanks for the comment.
This might be the case, but I think when refreshing online via the schedule it is often close to the time, but not AT the actual time.
In my experience, an On-Demand refresh and a scheduled refresh take roughly the same amount of time to refresh the data.
Why would you do that? I’m confused about this. Streaming datasets don’t use a gateway, so… is this making an import mode refresh faster? Or is this good for DirectQuery purposes?
I don’t see why you would change that.
Hi there
Thanks for the comment. In some situations this will allow the dataset to be refreshed quicker. It all depends on your situation!
Man, we are having a weird issue. When we set this setting to true, save the file and then restart the gateway we aren’t seeing any change in the behavior. The dataflow is still filling up the disk when we refresh it. Any experience with the gateway not honoring these config changes?
Hi there, thanks for the comment.
In the past I did see it working after the restart. I am not sure if it is because of the dataflow?