I am attempting to create a calculated field called "Page" using a custom CSV upload data source - a list of URLs from several websites combined into one big list of thousands of URLs on dozens of different domain names in Datastudio.
In the CSV, I have a field called "URL". It contains each pages full URL including the Root Domain Name.
I have another field for each of those records called "Root Domain Name". It has the Root Domain Name for each URL.
I'd like to extract the Root Domain Name identified in one field from the URL identified in another field, leaving just the "Page" path. The URLs vary in terms of Top Level Domain - some are .com, some are .co.uk, some are .fr, etc.
Ultimately, the output would be something like this:
- www.domain.com/test-page - > /test-page
- www.domain.co.uk/test-page - > /test-page
- www.domain.fr/test-page -> /test-page
- etc.
It seems like it'd be something like this but obviously isn't working hence my presence here today:
REGEXP_REPLACE(URL,Root Domain Name,'')
I am thinking that removing the value of one field from the value of another is one way of getting at it but there might be a better way of simply manipulating the URL field to remove everything prior to the 3rd / too.
I need to keep the first / after the Domain Name (a data formatting issue).
I'll be plugging away at this one and figured there has to be someone out there that has seen this before so welcome any input.
Have a good day everyone.