1

I have a field in which values look like:

Field
pos1-123|pos2 xyx123|pos3-abc|pos4x350

I want to slice the field into four different calculated fields using REGEXP_EXTRACT that look like:

  • Calculated Field 1 : pos1-123,
  • Calculated Field 2 : pos2 xyx123
  • Calculated Field 3 : pos3-abc
  • Calculated Field 4 : pos4x350

I've managed to pull Calculated Field 1 on my own by using:

> REGEXP_EXTRACT(Field, '^//|(//|[[:alnum:]]+)')

However, I'm getting stuck on iterating through the rest of the string.

Nimantha
  • 6,405
  • 6
  • 28
  • 69
Leo_Taco
  • 45
  • 10
  • Maybe `REGEXP_EXTRACT_ALL` with `([^|]+)` will do? – Wiktor Stribiżew Apr 19 '21 at 19:28
  • I think because Google Data Studio uses RE2 the REGEXP_EXTRACT_ALL function isn't allowed. – Leo_Taco Apr 19 '21 at 19:49
  • 1
    Ok, use 1) `REGEXP_EXTRACT(Field, '^([^|]+)')`, 2) `REGEXP_EXTRACT(Field, '^[^|]+\|([^|]+)')`, 3) `REGEXP_EXTRACT(Field, '^(?:[^|]+\|){2}([^|]+)')` and 4) `REGEXP_EXTRACT(Field, '^(?:[^|]+\|){3}([^|]+)')` – Wiktor Stribiżew Apr 19 '21 at 19:51
  • This is working perfectly now thanks! My only change is that in RE you need two \'s to escape a token so: ^(?:[^|]+\\|){2}([^|]+) – Leo_Taco Apr 19 '21 at 21:56

1 Answers1

1

You can use the following regular expressions:

REGEXP_EXTRACT(Field, '^([^|]+)')
REGEXP_EXTRACT(Field, '^[^|]+\\|([^|]+)')
REGEXP_EXTRACT(Field, '^(?:[^|]+\\|){2}([^|]+)')
REGEXP_EXTRACT(Field, '^(?:[^|]+\\|){3}([^|]+)')

Details:

  • ^ - start of string
  • (?:[^|]+\\|){3} - three occurrences ({3}) of
    • [^|]+ - any one or more chars other than |
    • \| - a | char
  • ([^|]+) - Capturing group 1: any one or more chars other than |.
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563