1

I need to write an SQL text for replacing only entire words.

For "whole word" I mean all occurrences which are not part of another alphanumeric word.

Example:

Replacing 'abc' with 'xyz'

| MYFIELD |
| 'test abc test' |
| 'abctest' |
| 'testabctest' |
| 'testabc' |
| 'abc test' |
| 'test abc' |
| 'abc' |
| ' abc ' |

Expected result:

| MYFIELD |
| 'test xyz test' |
| 'abctest' |
| 'testabctest' |
| 'testabc' |
| 'xyz test' |
| 'test xyz' |
| 'xyz' |
| ' xyz ' |

I've prepared an example using a common regular expression engine.

I've tried porting the same RegEx in Firebird SQL, using SIMILAR TO, but it didn't worked as expected:

UPDATE mytable
SET mytable = REPLACE(myfield, 'abc', 'xyz')
WHERE myfield SIMILAR TO '\babc\b'
Fabrizio
  • 7,603
  • 6
  • 44
  • 104
  • `TRIM(REPLACE(' ' || myfield || ' ', ' abc ', ' xyz '))` ANSI SQL. Don't know if Firebird supports it. – jarlh May 17 '19 at 07:52
  • @jarlh Firebird supports TRIM and REPLACE, but this approach has two caveats: 1) it assumes space is the only possible separator, not any other char, and 2) if the line starts or ends with spaces - those would be deleted as a result. If those caveats are no problem for TS then it should work – Arioch 'The May 17 '19 at 08:41
  • @Fabrizio - you SQL+RegEx approach is wrong anyway, think about the line like " abc 111abc111" - the first `abc` (and only the first) would make `WHERE` work, but then BOTH `abc` would get replaced. Perhaps you should do `MERGE` command, calculating new string for EVERY row, and then comparing old and new ones and then updating only those lines where new `IS DISTINCT FROM` old – Arioch 'The May 17 '19 at 08:48
  • [SQL regex syntax](https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-commons-predicates.html#fblangref25-commons-buildregex) used by `SIMILAR TO` doesn't support word boundaries, and if it did, the syntax you used would be incomplete as a `SIMILAR TO` needs to match the full string. You will either need to find a suitable UDF or UDR or build your own UDF or stored function to do this. – Mark Rotteveel May 17 '19 at 09:45
  • @Fabrizio . . . I think your fundamental issue is storing multiple values in a single string. Your "words" should probably be split out into separate rows in another table. – Gordon Linoff May 17 '19 at 16:45

1 Answers1

1

Assuming that your "words" are defined by spaces, then you can do:

UPDATE mytable
    SET myfield = TRIM(REPLACE(' ' || myfield || ' ', ' abc ', ' xyz '))
WHERE myfield LIKE '% abc %';

This adds a space at the beginning and end of the string, does the replace, and then removes the spaces.

Fabrizio
  • 7,603
  • 6
  • 44
  • 104
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • TRIM will remove all trailing and leading spaces, not only the ones added for the REPLACE – Fabrizio May 17 '19 at 13:23
  • I repeat the same: what would happen with line like " abc 111abc111" ? One should make a PSQL function replacing text and then use `MERGE` command – Arioch 'The May 17 '19 at 13:42
  • @Fabrizio . . . You have leading and trailing spaces in such a field? – Gordon Linoff May 17 '19 at 14:28
  • @Arioch'The: You're right, the code I've posted in the question would fail in that case, replacing both "abc" occurrences. But Gordon's code will replace only the first occurrence, am I wrong? – Fabrizio May 17 '19 at 15:02
  • @GordonLinoff: Yes, some records could have trailing and leading spaces (I've updated the question by adding the case with trailing and leading space) – Fabrizio May 17 '19 at 15:03
  • 1
    @Fabrizio You could replace the trim with a `substring` that that removes both spaces again (eg `substring(replace(' ' || myfield || ' ', ' abc ', ' xyz ') from 2 for char_length(myfield))`) – Mark Rotteveel May 17 '19 at 15:27
  • @Fabrizio and that I already covered in my first comment up there about delimeters :-) see a string like "zzz abc, xxx" - no substitution would happen. And I am not even starting with "Abc", "ABC" and maybe "aBc" :-D / So I stand by my idea, that the most general framework would be making custom-tailored PSQL function, passing every string from it and then using `MERGE` for replacing where it is worth it – Arioch 'The May 17 '19 at 16:23
  • @MarkRotteveel `....for char_length(result_string) - 2)` - or you are making fragile undocumented assumption that no substitution would change the string length ever – Arioch 'The May 17 '19 at 16:25
  • `... = reverse( substring( reverse( substring( replace(....) from 2) ) from 2) )` :-D – Arioch 'The May 17 '19 at 16:31
  • @Arioch'The Substring will work with the code as shown, but you're right it is fragile with changes that change the length. It would still be possible, but then you should probably be better of with doing it in a (procedural) function. – Mark Rotteveel May 17 '19 at 16:44
  • @MarkRotteveel it would not "still be possible" for a simple reason - you can not know the per-row delta-length because you can not know if the replacement pattern was meet 1 time in the string, or 2 times or 3 times, etc. Per-substitute delta alone is not enough, and multiplier is unknown. Yeah, I am nitpicking here a lot, just the work with legacy projects full of "this works now" undocumented implicit assumptions makes one's skin thin on such issues. – Arioch 'The May 17 '19 at 16:48
  • @Arioch'The In a stored function, you could guard against certain problems, and it would be simpler to chop off the first and last space based on the actual string length. – Mark Rotteveel May 17 '19 at 16:50