0

I have a table where the max length of a column (varchar) is 12, someone has loaded some value with a space, so rather than 'SPACE' it's 'SPACE '

I want to remove the space using a script, I was positive RTRIM or REPLACE(myValue, ' ', '') would work but LEN(myValue) shows there is still and extra character?

dwhybrow
  • 39
  • 1
  • 5

2 Answers2

1

As mentioned by a couple folks, it may not be a space. Grab a copy of ngrams8k and you use it to identify the issue. For example, here we have the text, " SPACE" with a preceding space and trailing CHAR(160) (HTML BR tag). CHAR(160) looks like a space in SSMS but isn't "trimable". For example consider this query:

DECLARE @string VARCHAR(100) = ' SPACE'+CHAR(160);
SELECT '"'+@string+'"'

Using ngrams8k you could do this:

DECLARE @string VARCHAR(100) = ' SPACE'+CHAR(160);

SELECT
  ng.position,
  ng.token,
  asciival = ASCII(ng.token)
FROM   dbo.ngrams8k(@string,1) AS ng;

Returns:

position   token   asciival
---------- ------- -----------
1                  32
2          S       83
3          P       80
4          A       65
5          C       67
6          E       69
7                  160

As you can see, the first character (position 1) is CHAR(32), that's a space. The last character (postion 7) is not a space. Knowing that CHAR(160) is the issue you could fix it like so:

SET @string = REPLACE(LTRIM(@string),CHAR(160),'')

If you are using SQL Server 2017+ you can also use TRIM which does a whole lot more than just LTRIM-and-RTRIM-ing. For example, this will remove leading and trailing tabs, spaces, carriage returns, line returns and HTML BR tags.

SET @string = SELECT TRIM(CHAR(32)+CHAR(9)+CHAR(10)+CHAR(13)+CHAR(160) FROM @string)
Alan Burstein
  • 7,770
  • 1
  • 15
  • 18
0

Odds are it is some other non-printing character, carriage return is a big one when going from between *nix and other OS. One way to tell is to use the DUMP function. So you could start with a query like:

SELECT dump(column_name)
FROM your_table
WHERE column_name LIKE 'SPACE%'

That should help you find the offending character, however, that doesn't fix your problem. Instead, I would use something like REGEXP_REPLACE:

SELECT REGEXP_REPLACE(column_name, '[^A-z]')
FROM your_table

That should take care of any non-printing characters. You may need to play with the regular expression if you expect numbers or symbols in your string. You could switch to a character class like:

SELECT REGEXP_REPLACE(column_name, '[:cntrl:]')
FROM your_table
Del
  • 1,529
  • 1
  • 9
  • 18
  • dump is an Oracle PLSQL function. The OP is using SQL Server which does not have a dump function and does not support Regular Expressions (without a CLR) – Alan Burstein Oct 03 '19 at 21:06