1

I have the following string:

'Siemens','Simatic','Microbox','PC','27','6ES7677AA200PA0','6ES7','677AA200PA0'

I want to remove any "terms" that are less than 5 characters. So in this case I'd like to remove 'PC', '27' and '6ES7'.

Which would result in:

'Siemens','Simatic','Microbox','6ES7677AA200PA0','677AA200PA0'

This is in SQL server and I have a function that accepts a regex command, so far it looks like this:

SELECT dbo.fn_StripCharacters(title, '/^''PC''$/')

I tried to hardcode to remove 'PC' but I think its removing all apostrophes, and 'P' and 'C' characters:

Siemens,Simati,Mirobox,,427B,6ES76477AA200A0,6ES7,6477AA200A0

This is the function I'm using:

CREATE FUNCTION [dbo].[fn_StripCharacters]
(
    @String NVARCHAR(MAX), 
    @MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    SET @MatchExpression =  '%['+@MatchExpression+']%'
    
    WHILE PatIndex(@MatchExpression, @String) > 0
        SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
    
    RETURN @String
    
END
Dale K
  • 25,246
  • 15
  • 42
  • 71
Lee
  • 1,485
  • 2
  • 24
  • 44
  • Does the actual string have single quotes in it? – Tim Biegeleisen Aug 18 '21 at 08:41
  • Yeah! I think thats half the problem tbh – Lee Aug 18 '21 at 08:44
  • You should really move away from storing CSV like this. Instead, store each CSV value on a separate row. Then, your requirement becomes very easy. – Tim Biegeleisen Aug 18 '21 at 08:45
  • I don't have a choice unfortunately, the data comes from a 3rd party. I could replace the apostrophes with another symbol, would that make the regex easier? I'm naff with regex – Lee Aug 18 '21 at 08:48
  • _the data comes from a 3rd party_ That does not prevent you for "correcting" it during import and storing it in proper normalized tables. That is a traditional and common function of ETL. – SMor Aug 18 '21 at 11:16

1 Answers1

3

If you don't care about the particular order of the words which are retained after filtering off words 4 characters or less, you could use STRING_SPLIT and STRING_AGG:

WITH cte AS (
    SELECT id, value
    FROM yourTable
    CROSS APPLY STRING_SPLIT(val, ',')
)

SELECT id, STRING_AGG(value, ',') AS val
FROM cte
WHERE LEN(value) > 6
GROUP BY id;

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Sorry gotta say that this is amazing. I have no idea how you guys come up with this stuff. It works perfectly for my needs and also gives freedom for messing with the WHERE clause. Thanks again – Lee Aug 18 '21 at 10:04
  • @Lee After answering several thousand SQL Server questions, plus having used it for more than a decade, you pick up tricks here and there. By the way, to get to know a real SQL guru on this site, [check out Gordon Linoff's profile](https://stackoverflow.com/users/1144035/gordon-linoff). – Tim Biegeleisen Aug 18 '21 at 10:06