I had to migrate data from old system to new system and I found that we have one field called ‘notes’ having duplicated comma separated IDs. The next stage was to remove the duplicated (IDs) from string (notes column).
We can develop this solution via while Loop but I developed it without Loop.
Given below is the solution that can remove duplicate entry from comma, semi colon or any other delimited string .
Create Function dbo.[UDF_Remove_Duplicate_Entry] ( @Duplicate_String VARCHAR(MAX), @delimiter VARCHAR(2) ) RETURNS VARCHAR(MAX) BEGIN DECLARE @Xml XML DECLARE @Removed_Duplicate_String VARCHAR(Max) SET @Duplicate_String=REPLACE(@Duplicate_String,'&','And') SET @delimiter=REPLACE(@delimiter,'&','And') SET @Xml = CAST(('<A>'+REPLACE(@Duplicate_String,@delimiter,'</A><A>')+'</A>') AS XML) ;WITH CTE AS (SELECT A.value('.', 'varchar(max)') AS [Column] FROM @Xml.nodes('A') AS FN(A)) SELECT @Removed_Duplicate_String =Stuff((SELECT '' + @delimiter + '' + [Column] FROM CTE GROUP BY [column] FOR XML PATH('') ),1,1,'') SET @Removed_Duplicate_String=REPLACE(@Removed_Duplicate_String,'And','&') RETURN (@Removed_Duplicate_String) END GO --For Example : SELECT dbo.[UDF_Remove_Duplicate_Entry] ('1,2,2,3,4,1,1,2,3,5',',') AS [Comma Delimited] GO SELECT dbo.[UDF_Remove_Duplicate_Entry] ('1;2;2;3;4;1;1;2;3;5',';') AS [Semi Colon Delimited] GO --OUTPUT
Comma Delimited
——————
1,2,3,4,5
(1 row(s) affected)
Semi Colon Delimited
——————
1;2;3;4;5
(1 row(s) affected)
Hi Thanks for this solution, how would I pass on a value from a record instead of a value?
Hi,
Instead of ‘1,2,2,3,4,1,1,2,3,5’ you need to put field name and it will remove duplicates from your fields.
Let me know if it will not work for you then I will create a sample.
Thanks
Imran
Thank you it works. Now I need to figure out how to create the query to update each row. I tried this below for one record but it seems to take its record’s value and apply to all records.
UPDATE EDDSDBO.Document
SET Custodian2 = (SELECT dbo.[UDF_Remove_Duplicate_Entry] (CWCustodians,’;’) AS [Semi Colon Delimited]
FROM EDDSDBO.Document
WHERE ArtifactID = 1046614)
Disregard – I figured it out..
UPDATE EDDSDBO.Document
SET Custodian2 = dbo.[UDF_Remove_Duplicate_Entry] (CWCustodians,’;’)
Welcome.
Yes,You are right.
Thanks,
Imran
Good usage of CTE Muhammad but the XML has problem with some characters ( “&”, “”). You could bring your UDF to a next level by escaping them.
SELECT dbo.[UDF_Remove_Duplicate_Entry] (‘test|test|a&b’,’|’) is failing with an XML parsing error.
Hi Francois,
Thank you for your feedback. I resolved this bug from the stored procedure.
Kindly test it again.
Imran
Thanks.
Works fine.