SSIS & TSQL – Geo- location from IP Address to Country
Below are the details firstly from getting your Geo – location information from a table which has your IP Address to country information.
And then getting your IP Address information from your source systems and quickly getting this to a Country so that you can see where the people are coming from.
Geo – Location information
There are a lot of Geo – Location places where you can download the IP Address ranges which then relate to countries.
NOTE: Due to IP Address ranges changing over time, it is a good idea to ensure that your IP Address Range Info table gets updated at the very least once a month.
· You can download it from the following location:
o http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip
You can use the following below to select only Valid IP Addresses
o where[IP Address]not like‘%[^.0-9]%’
Using SSIS, tables, indexes and stored procedures to get from IP address to country as quickly as possible.
Below are the additional steps to get the data from an IP Address mapped to a country?
Source Tables
Below is a list of source tables which you will require, and will be used later
1. IP Address Range Info Table
a. This is the table that has the IP Address ranges, which map to a Country (Normally a CountryID)
b. NOTE: You can create this table from the download above.
c. Once you have your table one of the biggest things is to create the index below:
CREATE CLUSTERED INDEX[IXC_Start_End_IP] ON [dbo].[tb_IPAddressRangeInfo]
(
[StartIPAddress]ASC,
[EndIPAddress]ASC
)WITH (PAD_INDEX=OFF,STATISTICS_NORECOMPUTE=OFF,SORT_IN_TEMPDB=OFF,DROP_EXISTING=OFF,ONLINE=OFF,ALLOW_ROW_LOCKS=ON,ALLOW_PAGE_LOCKS=ON,FILLFACTOR= 100,Data_Compression=Page)
GO
i. NOTE: In the above Index we created a FILL Factor of 100% and we also compressed the Index.
1. Also remember to update the index when you put new data into your IP Address Range Info Table.
2. Country Table
a. This is the table which has the mapping from CountryID to Country Name
Converting from IP Address to IP Number
Next what you will need to do is to convert from an IP Address to an IP Number.
NOTE: This is required because the IP Address Range Info table has the IP Ranges stored as IP Numbers.
1. What we did in order to make the Process quicker in terms of getting it from an IP Address to country is we first only selected the distinct IP Addresses.
a. NOTE: here it was just a simple select. Also note what we used to only get valid IP Addresses
Select Distinct([IP Address])
from staging_tb_ClientSourceInformation with (nolock)
where [IP Address] not like‘%[^.0-9]%’
Option (Fast 10000)
b. We then put this into a table called:
i. [dbo].[staging_tb_DistinctIPAddresses]
2. The next step is we now are going to Update our column called IPNumber.
a. We do this by using the following below which is in a proc to calculate the IP Number:
UPDATE
— This will be our Staging Table
dbo.[staging_tb_DistinctIPAddresses]
SET
[IPNumber]= (CAST(dbo.fn_Parse([IP Address],‘.’,1) ASBIGINT)* 16777216)+(dbo.fn_Parse([IP Address],‘.’,2)*65536)+(dbo.fn_Parse([IP Address],‘.’,3)* 256)+dbo.fn_Parse([IP Address],‘.’,4)
WHERE
— This works to ONLY get the IPAddress not like ‘%[^.0-9]%’
[IP Address] not like‘%[^.0-9]%’and
[IPNumber] ISNULL AND
[CountryStateSK]ISNULL
i. NOTE: We are using the Parse Function in order to convert each octet into our IP Number.
b. At the same time we are also updating our destination table with the IP number, so that later when we join back to get the CountryID we have a column to match on.
i. NOTE: The query is identical to above, but we are just changing the table we are updating.
3. Now this is the part that used to take the longest, where we are looking at the IP Address Range Info and then finding the CountryID and then inserting this into a new table with just the IPNumber and CountryID
a. The Table we insert into is called:
i. [dbo].[Staging_tb_DistinctIPAddressesWithCountrySK]
b. NOTE: The reason for the Insert into a table is because that is quicker than doing an update when the dataset gets large.
c. NOTE 1: This is also the section where we are using the new TSQL Syntax to find the CountryID from the IP Address info Range as quickly as possible.
d. Now here is the proc where we do this:
Select C.ID as CountryStateSK,IPNUmber
FROM
dbo.[staging_tb_DistinctIPAddresses] A
LEFTOUTERJOIN
[dbo].[tb_Country] CON
C.ID = (select CountryId from [dbo].[tb_IPAddressRangeInfo]
where StartIPAddress=(select max(StartIPAddress) from [tb_IPAddressRangeInfo] where StartIPAddress<=IPNumber)
and EndIPAddress >=IPNumber)
where
A.[IP Address] not like‘%[^.0-9]%’ AND
A.[IPNumber]IS NOT NULL
i. As you can see above the section highlighted in RED is where we are getting the CountryID from our IP Number Range
e. We take the output of this data and insert into an additional staging table.
i. As explained above using the new TSQL Syntax to find the CountryID from the IP Address info Range as quickly as possible
4. The last part is where we now use our Distinct IPNumbers and CountryID to join back to our Source Table based on the IP Number.
a. And here we then update the Source Tables CountryStateSK with the relevant details as shown below:
Update dbo.Staging_tb_ClientSourceInformation
Set CountryStateSK = I.CountryStateSK
from [dbo].[Staging_tb_DistinctIPAddressesWithCountrySK]asIwith (nolock)
Inner join dbo.Staging_tb_ClientSourceInformationasSwith (nolock)
on i.IPNumber = S.IPNumber
— The reason that this is set to 1580 is because that is the Row we inserted where we have no data
Update dbo.staging_tb_ClientSourceInformation
Set CountryStateSK = 1580
from dbo.staging_tb_ClientSourceInformationwith (nolock)
where CountryStateSK is null
i. As you can see above this is a simple update statement based on the IP Number.
Reference to the Parse Function
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Create FUNCTION[dbo].[fn_Parse](@String VARCHAR(500),@Delimiter VARCHAR(10),@Position INTEGER)
RETURNS VARCHAR(500)
AS
–DECLARE @String VARCHAR(500)
–DECLARE @DelimiterVARCHAR(10)
–DECLARE @Position INTEGER
— SET @Delimiter = ‘|’
— SET @Position = 3
— SET @Counter = 1
BEGIN
DECLARE @Counter INTEGER
DECLARE @Return_Value VARCHAR(500)
SET @Counter= 1
WHILE @Counter<@Position
BEGIN
SET @String= RIGHT(@String,LEN(@String)-CHARINDEX(@Delimiter,@String))
SET @Counter = @Counter + 1
END
SET @Counter = CHARINDEX(@Delimiter,@String)
SET @Return_Value=
CASE
WHEN @Counter > 0 THEN LEFT(@String,@Counter–1)
ELSE @String
END
RETURN (@Return_Value)
END