0

I'm using REGEXP_EXTRACT to get the last path of the Page Dimension in Google Data Studio. The Page URL has one identifier which is the same on all possible URLs:

+------------------------------------------------------+
|                         Page                         |
+------------------------------------------------------+
| /ABC/something1/something2/something3/lastpath       |
| /ABC/something1/something2/something3/last path      |
| /ABC/something1/something2/something3/last-path      |
| /ABC/something1/something2/something3/last last-path |
+------------------------------------------------------+

So I want to extract the last path behind /something3/, all the values in the last directory.

Here is what I got:

REGEXP_EXTRACT(Page,'/ABC/([^/]+/){3}') 

But this matches only the /something3/ directory.

Ardent Coder
  • 3,777
  • 9
  • 27
  • 53
elitebook0711
  • 13
  • 1
  • 5

2 Answers2

2

Idea is to use capturing groups only for something you want extract. For everything else use the non-capturing (?:REGEX).

SELECT REGEXP_EXTRACT(
    '/ABC/something1/something2/something3/last last-path',
    '/ABC/(?:[^/]+/){3}(.+)');
Gramotei
  • 1,524
  • 14
  • 19
  • 2
    I think this will only work when there are exactly 3 'somethings'. – tomb Jul 16 '19 at 11:25
  • Ghramotei's, is good if, you need it to happen 3 sections after /ABC/. However, if you want it to happen for all 5th sections then: '^(?:[^\\/]*[\\/]){5}([^\\/]*)'. – Aaron C Jul 17 '20 at 22:35
2

To capture "all the values in the last directory" (all characters after the last /), the following REGEXP_EXTRACT Calculated Field does the trick:

REGEXP_EXTRACT(Page, "([^/]*)$")

Google Data Studio Report as well as a GIF to elaborate:

Nimantha
  • 6,405
  • 6
  • 28
  • 69