In this blog post I am going to determine which item workload uses the least amount of Capacity Units when copying the same data from an On-Premises SQL Server.

The item workloads that I can use to copy data are Dataflow Gen1, Dataflow Gen2 and Pipelines.

Setup for Dataflow Gen1

Below is how I configured my Dataflow Gen1 to get the data from my “Fact Sale” table in my On-premises SQL Server.

A screenshot of a computer

Description automatically generated

Setup for Dataflow Gen 2

Below is how I configured my Dataflow Gen2 to get the data from my “Fact Sale” table in my On-premises SQL Server.

A screenshot of a computer

Description automatically generated

Setup for Pipeline

Below is how I configured my Pipeline to get the data from my “Fact Sale” table in my On-premises SQL Server.

A screenshot of a computer

Description automatically generated

Comparing Capacity Units

To get an accurate comparison, I ran the workloads multiple times. I did not run the pipeline again as both instances had the same capacity units.

Dataflow Gen1 took an average of 78 CUs

Dataflow Gen2 took an average of 550 CUs

Pipeline took an average of 360 CUs

Summary

While I did test Dataflow Gen1 I think it is not 100% fair to say that it took considerably less time than the others are that the data from the Dataflow Gen1 still needs to be loaded into a Semantic Model, which would then consume additional Capacity Units.

As shown above, the Pipeline uses the least number of Capacity Units.

Please always make sure that you do your own testing as this can vary on a multitude of factors. This should be a good indication of which workload could be used to save on capacity utilization.