I have created a temp table that pulls in fields whose values each represent an attribute. Now I want to add logic that compares these attributes and creates a new field summarizing the ref_type and post_campaign fields.

I am trying to create a new column (x) based on the conditions below:

> > if post_campaign starts with KNC- and ref_type = 3, then set column (x) to PS  
> > if post_campaign is null and ref_type = 3, then set column (x) to OS  
> > if post_campaign starts with SNP-, then set column (x) to Pso  
> > if post_campaign starts with SNO- and ref_type = 9, then set column (x) to Opso  
> > if ref_type = 6, then set column (x) to Dir

I have written the code that creates the temp table, but I need help inserting the above logic into the SQL query:

create table temp.Register
Select
    date(date_time) as date,
    post_evar10,
    count(page_event) as Pageviews,
    concat(post_visid_high, post_visid_low) as UniqueVisitors,
    ref_type as Source_Traffic,
    paid_search,
    post_campaign
from a_hits
where ref_type in (3,6,7,9)
and (
    post_evar10 like '%event-summary%'
    or post_evar10 like 'registration-'
    or post_evar10 like '%InformationPage%'
    or post_evar10 like '%GuestRegInfo%'
    or post_evar10 like '%GuestReg%'
    or post_evar10 like '%MyRegistration%'
)
and page_event like '0'
and exclude_hit like '0'
and hit_source not in (5,7,8,9)
group by Date, post_evar10, UniqueVisitors, Source_Traffic, paid_search;

The expected result is a new column, like this:

Date      Post_evar10    Pageviews  UniqueVisitors  Source_Traffic  post_campaign  Column X
2/2/2019  event-summary  540        200             3               KNC-%          PS
2/2/2019  event-summary  300        150             3               Null           OS
2/3/2019  event-summary  230        100             9               SNO-%          Opso
2/4/2019  event-summary  290        150             9               SNP-%          Pso
2/5/2019  event-summary  100        300             6               Misc           Dir
NewCode

1 Answer

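Assuming you are on a recent version of Spark SQL, you can use a CASE ... WHEN expression. The conditions are checked top to bottom and the first match wins; rows that match none of them get NULL.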

Learn more about CASE...WHEN here

create table temp.Register

Select 
    date(date_time) as the_date, 
    post_evar10, 
    count(page_event) as Pageviews, 
    concat(post_visid_high, post_visid_low) as UniqueVisitors, 
    ref_type as Source_Traffic, 
    paid_search, 
    post_campaign,
    -- Branches are evaluated in order; the first matching WHEN decides the value
    CASE
        WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
        WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
        WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
        WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
        WHEN ref_type = 6 THEN 'Dir'
    ELSE NULL END AS Column_X
from 
    a_hits

where 
    ref_type in (3,6,7,9)
    and (
        post_evar10 like '%event-summary%'
        -- no wildcard here, so this matches only the exact string 'registration-'
        or post_evar10 like 'registration-'
        or post_evar10 like '%InformationPage%'
        or post_evar10 like '%GuestRegInfo%'
        or post_evar10 like '%GuestReg%'
        or post_evar10 like '%MyRegistration%'
    )
    and page_event like '0'
    and exclude_hit like '0'
    and hit_source not in (5,7,8,9)

group by 
    the_date, 
    post_evar10, 
    UniqueVisitors, 
    Source_Traffic, 
    paid_search,
    -- post_campaign is selected unaggregated and used in the CASE, so it must be grouped too
    post_campaign
;
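
If you want to sanity-check the branch order before running it against a_hits, here is a minimal sketch that runs the same CASE expression on a few made-up rows. It uses Python's built-in sqlite3 module rather than Spark, purely so it runs anywhere; the `hits` table, the `classify` helper and the campaign values such as 'KNC-123' are all hypothetical. Note that SQLite's LIKE is case-insensitive for ASCII while Spark's is case-sensitive, which makes no difference for these uppercase samples.

```python
import sqlite3

# Hypothetical sample rows: (ref_type, post_campaign). Campaign values are invented.
ROWS = [
    (3, "KNC-123"),
    (3, None),
    (9, "SNO-456"),
    (9, "SNP-789"),
    (6, "Misc"),
    (7, "Misc"),  # matches no branch, so it falls through to ELSE NULL
]

# Same CASE expression as in the answer, run against a throwaway table.
CASE_SQL = """
SELECT ref_type,
       post_campaign,
       CASE
           WHEN post_campaign LIKE 'KNC-%' AND ref_type = 3 THEN 'PS'
           WHEN post_campaign IS NULL AND ref_type = 3 THEN 'OS'
           WHEN post_campaign LIKE 'SNP-%' THEN 'Pso'
           WHEN post_campaign LIKE 'SNO-%' AND ref_type = 9 THEN 'Opso'
           WHEN ref_type = 6 THEN 'Dir'
           ELSE NULL
       END AS column_x
FROM hits
ORDER BY rowid
"""


def classify(rows):
    """Load rows into an in-memory table and return the column_x labels in order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (ref_type INTEGER, post_campaign TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?)", rows)
    labels = [row[2] for row in conn.execute(CASE_SQL)]
    conn.close()
    return labels


LABELS = classify(ROWS)
print(LABELS)
```

One thing this makes visible: because the SNP- branch has no ref_type condition and comes before the ref_type = 6 branch, a row with ref_type 6 and an SNP- campaign is labelled Pso, not Dir. Reorder the branches if you want the opposite.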
artemis