0

I have a column in my database that contains campaign_name, the country codes are present in the names and I want to be able to extract them and transform them into their normal names as well as create a new column called country name that includes the country from each campaign. This is the snippet of the code:

WHEN LOWER(cmp.campaign_name) LIKE '%canada%' THEN 'Canada'
WHEN LOWER(cmp.campaign_name) LIKE '%uk%' THEN 'United Kingdom'
WHEN (cmp.campaign_name) LIKE '%US%' THEN 'United States'
ELSE 'other'
END AS country_name.

This is an example of the campaign name "campaign_UK-US_receiver" The problem with this code is that unlike a for loop in most programming languages I could create a list of country and check for the first occurrence of words that are the same as in the list, then return the country name of the respect code. The problem with using SQL is that based on the first statement which is for Canada if the first row SQL iterates over is Canada it'll get it right but if its on the next statement which is for UK. if it see's a campaign name like "campaign_Canada_UK_receiver" instead of picking Canada as its the first occurrence of the countries it'll pick UK because it has executed Canada already.

ekimebg
  • 13
  • 1
  • You might use regexp_extract and build the regex pattern to pull out the first match, and then build your case statement upon that. For example, something like CASE WHEN LOWER(REGEXP_EXTRACT(your_column, r'(uk|canada|us)')) however this regex pattern I did not test, but you achieve basically anything with regex if youre determined enough – Josh May 25 '23 at 13:14
  • 1
    Thanks for the response. I was able to use a combination of split() and offset(). Only needed to write a line of code. I previously didn't know there was a split() function in SQL, well there isn't but it exists in big query. – ekimebg May 26 '23 at 21:57

1 Answers1

0

As per @Josh, this regex pattern can be a workaround:

CASE 
WHEN LOWER(REGEXP_EXTRACT(your_column, r'(uk|canada|us)'))

You can also visit this case for a representation.

Also, @ekimebg mentioned that the workaround used was the combination of SPLIT() with the OFFSET function.

Please feel free to edit this community wiki and provide the exact workaround, confidential info redacted, for the benefit of the community that might encounter this use case in the future.

Poala Astrid
  • 1,028
  • 2
  • 10
  • Definitely, my solution is below. `SPLIT(name, '-')[OFFSET(2)] AS alias`. The second index value after the hyphen was the information I needed – ekimebg Jun 10 '23 at 14:40