Feeds:
Posts
Comments

Posts Tagged ‘Comma Seperated Values’

In SQL Server 2022, an enhancement came in STRING_SPLIT() function, which was a long awaited functionality. This enhancement provides a new column name ordinal, which has row number for each string split by this function.

Let me create a sample to demonstrate the functionality of new ordinal column in STRING_SPLIT() function as shown below.

Sample:

USE tempdb
GO
   
CREATE TABLE #Employee
(
 EmployeeID     INT IDENTITY(1,1),
 EmployeeName   VARCHAR(500),
 EmailAddresses VARCHAR(500)
);
GO
   
INSERT INTO #Employee(EmployeeName, EmailAddresses)
VALUES
('John', 'John_1@gmail.com;John_2@gmail.com;John_3@hotmail.com'),
('Sarah', 'Sarah_1@gmail.com;Sarah_2@hotmail.com;Sarah_3@gmail.com'),
('Aaron', 'Aaron_1@gmail@com;Aaron_2@hotmail.com'),
('Ackerman', 'Ackerman_1@gmail.com;Ackerman_2@hotmail.com');
GO

SELECT * FROM #Employee;
GO
--OUTPUT

Example 1:

In the below example, I will split the string in email addresses column based on a separator as usual. Also, I will pass the third argument (which came in SQL 2022 as an ehancement) as 1 in STRING_SPLIT() function which means that STRING_SPLIT() will not only split the string but also provide a serial number (ordinal column) against each split string as shown below.

USE tempdb
GO

SELECT EmployeeID
     , EmployeeName
 , value AS EmailAddress
 , Ordinal
FROM   #Employee
CROSS APPLY STRING_SPLIT(EmailAddresses, ';', 1);
GO
--OUTPUT

Example 2:

In this example, I will show you how important is this new column ordinal and how we used to struggle in earlier version of SQL Server (Earlier than 2022) to mimic the same functionality with the help of common table expression & other functions.

Let me grab the first two email addresses for each employee using old and new approach. Both approaches will return the same result but new approach is simple & efficient.

  • Old Approach: (Using ROW_NUMBER() & common table expression)


USE tempdb
GO
 
;WITH CTE AS
(SELECT EmployeeID
      , EmployeeName
      , value AS EmailAddress
      , ROW_NUMBER() OVER(PARTITION BY EmployeeID 
						  ORDER BY EmployeeID ASC) AS Ordinal
FROM    #Employee
CROSS APPLY STRING_SPLIT(EmailAddresses, ';'))
SELECT * FROM CTE WHERE Ordinal<3;
GO
--OUTPUT

  • New Approach: (Using ordinal column)

USE tempdb
GO
 
SELECT EmployeeID
     , EmployeeName
     , value AS EmailAddress
	 , Ordinal
FROM   #Employee
CROSS APPLY STRING_SPLIT (EmailAddresses, ';', 1)
WHERE ordinal<3;
GO
--OUTPUT

Conclusion:

I found the new enhancement of STRING_SPLIT() function very useful, earlier we used common table expression & ROW_NUMBER() functions to find the row number of each split string but now we can easily achieve with the help of STRING_SPLIT() ordinal column. Do let me know if you use ordinal column and how did you find it.

Read Full Post »

Splitting string in SQL Server was always challenging, we need to create a complex user defined function, using multiple SQL Server built in functions in order to achieve it. I have written an article back in 2012 regarding this. In SQL Server 2016, splitting string became so easy, SQL Server introduced a function to split string namely STRING_SPLIT(), it requires the compatibility level to be at least 130.

Let me create a sample to demonstrate STRING_SPLIT() functionality. In given below sample, I have multiple email addresses separated by a semicolon delimiter against each employee as shown below.

Sample :

USE tempdb
GO
  
CREATE TABLE #Employee
(
 EmployeeID     INT IDENTITY(1,1),
 EmployeeName   VARCHAR(100),
 EmailAddresses VARCHAR(500)
);
GO
  
INSERT INTO #Employee(EmployeeName, EmailAddresses)
VALUES
('John', 'John_1@gmail.com;John_2@gmail.com;John_3@hotmail.com'),
('Sarah', 'Sarah_1@gmail.com;Sarah_2@hotmail.com;Sarah_1@gmail.com'),
('Aaron', 'aaron_1@gmail@com;aaron_2@hotmail.com'),
('Ackerman', 'ackerman_1@gmail.com;ackerman_2@hotmail.com');
GO
  
SELECT * FROM #Employee
GO
--OUTPUT

Now sample is ready, let’s apply STRING_SPLIT() function to see how it works and how easy it is to implement it.

Example 1:

Given below is an example where I used sample created above and split email addresses based on delimiter using STRING_SPLIT() function.

USE tempdb
GO
   
SELECT EmployeeID
     , EmployeeName
     , value AS EmailAddress 
FROM  #Employee
CROSS APPLY 
STRING_SPLIT(EmailAddresses,';')
GO
--OUTPUT

In above records, we have successfully split email addresses and now we have one email address per row. In the above records, though we have successfully split email addresses, still we have duplicate email (Sarah_1@gmail.com) in email address column.

Let’s fix duplicate issue in below example.

Example 2: Removal of duplicate data

In order to remove duplicate email address, we need to use GROUP BY function along with STRING_SPLIT() function in the given below example.

USE tempdb
GO
  
SELECT EmployeeID
     , EmployeeName
     , value AS EmailAddress
FROM  #Employee
CROSS APPLY STRING_SPLIT(EmailAddresses,';')
GROUP BY EmployeeID 
       , EmployeeName
       , value;
GO
--OUTPUT

As you can see above, duplication in email address column has been removed.

Conclusion:

I found STRING_SPLIT() function very useful and we can easily split string without even writing a lot of scripts. You can also implement it wherever it is required without any complexity. Do let me know if you face any issues.

Read Full Post »