SSIS – Fastest data flow task item for transferring data over the network
My goal was to find out which combination of data flow Source and Destinations would enable me to transfer data over a network as quickly and as efficiently as possible.
· Below are the outcomes of the tests.
· Then, based on the fastest data flow items, I detail how to configure the data flow source and destination in order to get the maximum throughput.
Below are the tests I completed, each with the configuration I used for that test.
Just some details on how the test was performed:
· It was performed between 2 servers which were on a 10Gbit Connection
· At the time there was nothing else running on the servers; both were idle.
· Both the Source and Destination tables had compression enabled on the physical tables.
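Since both tables were compressed, it can be worth confirming this before comparing runs. A minimal T-SQL check (using the table name dbo.tb_TableName from the example later in this post; substitute your own):

```sql
-- Show the compression setting for each partition of the table;
-- data_compression_desc is NONE, ROW, or PAGE.
SELECT p.partition_number,
       p.data_compression_desc
FROM sys.partitions AS p
WHERE p.[object_id] = OBJECT_ID(N'dbo.tb_TableName')
  AND p.index_id IN (0, 1);   -- heap or clustered index only
```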
| Data Flow Source Type | Data Flow Source Configuration Settings | Data Flow Destination Type | Data Flow Destination Configuration Settings | Total Rows | Time in Seconds | Rows/Sec | Network Speed During Transfer |
|---|---|---|---|---|---|---|---|
| ADO NET Source | Packet Size on ADO.NET Connection Manager: 32768 | ADO NET Destination | Packet Size on ADO.NET Connection Manager: 32768; Batch Size: 0 | 10 million | 169.828 | 58,883.11 | 14 MB/sec |
| ADO NET Source | Packet Size on ADO.NET Connection Manager: 32768; T-SQL query hint `OPTION (FAST 10000)` | ADO NET Destination | Packet Size on ADO.NET Connection Manager: 32768; Batch Size: 10,000 | 10 million | 158.75 | 62,992.13 | 14 MB/sec |
| OLE DB Source | Provider: SQL Server Native Client 11.0; T-SQL query hint `OPTION (FAST 10000)` | OLE DB Destination | Provider: SQL Server Native Client 11.0; Commit Size: 0 | 10 million | 21.046 | 475,149.67 | 100 MB/sec |
| OLE DB Source | Provider: SQL Server Native Client 11.0; T-SQL query hint `OPTION (FAST 10000)`; Packet Size on OLE DB Connection Manager: 32767 | OLE DB Destination | Provider: SQL Server Native Client 11.0; Commit Size: 0; Packet Size on OLE DB Connection Manager: 32767 | 10 million | 20.578 | 485,955.88 | 110-115 MB/sec |
| OLE DB Source | Provider: Microsoft OLE DB Provider for SQL Server; T-SQL query hint `OPTION (FAST 10000)`; Packet Size on OLE DB Connection Manager: 32767 | OLE DB Destination | Provider: Microsoft OLE DB Provider for SQL Server; Commit Size: 0; Packet Size on OLE DB Connection Manager: 32767 | 10 million | 20.031 | 499,226.20 | 120 MB/sec |
Observations from the above tests
· The difference between the ADO.NET Source and Destination and the OLE DB Source and Destination is significant.
o The OLE DB combination was just over 8x faster.
· It was also interesting to note that the network card throughput was more than 8x higher (from 14 MB/sec to 120 MB/sec) after changing from ADO.NET to OLE DB.
· When moving large amounts of data, this means transfers can go from hours to minutes.
· Below is a chart showing how significant the difference is:
Configuration of OLE DB Data Flow Source and Destinations for maximum throughput over a network.
Below are the configuration settings for the final row in the table above.
1. The first part is to configure your OLE DB Source connection.
a. Create your new OLE DB Source connection, and once it is open ensure that the following is selected:
i. In the Provider drop-down, select: Microsoft OLE DB Provider for SQL Server
b. Then put in your Server Name and Database.
c. Next click on All in the left-hand column.
d. Scroll down near the bottom to where it says Packet Size.
i. Change the value from 4096 to:
1. 32767
2. NOTE: It must be exactly this number; if you set it any higher the connection WILL fail.
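To confirm the packet size actually in effect, a quick T-SQL check (run from a session that uses the same connection settings) is:

```sql
-- net_packet_size reports the network packet size negotiated
-- for the current session's connection.
SELECT net_packet_size
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```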
2. Next, for the OLE DB Destination connection, follow all the steps in Step 1; only the server changes to your destination server.
3. Now, what sometimes makes the source query faster is the T-SQL hint: OPTION (FAST 10000)
a. NOTE: This sometimes makes the query faster, and other times it can slow the query down, so test first.
b. E.g.:

SELECT RowID, DateAdded
FROM dbo.tb_TableName WITH (NOLOCK)
OPTION (FAST 10000)
4. The final part is to configure the OLE DB Destination in the data flow task.
a. Double-click or right-click the OLE DB Destination and go into its properties.
b. Then ensure that it is configured as follows:
c. NOTE: the Maximum insert commit size is set to zero.
i. You must ensure that you have enough memory to commit the entire transaction at once.
ii. If you do NOT have enough memory, the commit will fail when SSIS tries to apply it.
iii. If this happens to you, set the Maximum insert commit size to a value that fits your memory allocation.
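Inside the package, these destination settings correspond to custom properties on the OLE DB Destination (visible in the Properties window or the Advanced Editor). A sketch of the values used here, assuming the default fast-load options:

```
AccessMode:                  OpenRowset Using FastLoad
FastLoadMaxInsertCommitSize: 0     (single commit; needs enough memory)
FastLoadOptions:             TABLOCK,CHECK_CONSTRAINTS
```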
Update – Potential issue with the Maximum Insert Commit size (25 July 2013)
1. I was recently doing a large insert of roughly 640 million rows, using the above method to get the data across as quickly as possible.
2. I ran into an issue when it then had to complete the commit, which led me to the findings listed below.
3. When you use a Maximum insert commit size of zero and there is not enough memory on the server, SSIS spills the data to TempDB, so you do not really get the benefit of keeping the data in memory.
4. Because the data then had to be transferred from TempDB into the working database, rather than from the memory buffers, performance was once again very poor.
5. On top of this, every insert into the working database was now being logged; because every row was inserted and logged, my log file drive ran out of space when the log hit 340 GB.
6. Once it failed, the entire transaction rolled back.
Lessons learnt
· Only use a Maximum insert commit size of zero when inserting no more than roughly 10 million rows at a time.
· If it spills over to TempDB, performance is very slow.
· If you are moving a large data set, rather set the Maximum insert commit size to the value that gives the best insert throughput on your system.
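Given the 340 GB log growth described above, it can help to watch log usage while such a load runs. A simple built-in check is:

```sql
-- Reports log file size and percentage used for every database on the instance
DBCC SQLPERF(LOGSPACE);
```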