0

I am working on athena . I have a table with two columns of interest besides other columns. They are id and account architect column. Both are string type .

enter image description here

This is how the table looks like now .

The end product should be ;

enter image description here

The query that i have is : enter image description here

enter image description here

How should my query look like to get this end product .

user2974748
  • 61
  • 1
  • 1
  • 5

1 Answers1

0

This should solve your problem. Note Here I am using regexp_replace to remove all special characters like (comma,],[,double quotes).

select array_agg(distinct id),regexp_replace(username,'[^[:alnum:]'' '']') from 
(
select split_part("sto architect",',',1) as username,id from xyz_teams
Union 
select split_part("sto architect",',',2) as username,id from xyz_teams
) temp_tbl
group by regexp_replace(username,'[^[:alnum:]'' '']')
Deepak Kumar
  • 298
  • 1
  • 7
  • thank you very much @deepak ... the query works well indeed . pardon me for my very stupid question . I added architects here -- > SELECT array_agg(distinct id),regexp_replace(username,'[^[:alnum:]'' '']') AS architects from ( ..... ) to reflect column name as architects . I cant figure out where to add AS id to reflect the first column name as id . it now shows as _col0 . Thank you , your response has been very helpful . – user2974748 Nov 20 '19 at 13:51
  • So now after running the above code the output looks like what i mentioned above . – user2974748 Nov 20 '19 at 18:50
  • So now after running the above code the output looks like what i mentioned above .Because the actual data has account ids of varied lengths and also include characters as well , my final effort is to only show those account ids in the id column with a certain length say 6 and omit any length other than 6 and also any id which has any character in between other than a number . – user2974748 Nov 20 '19 at 19:02
  • _col0 architects 837246,tiy-973-gjdu djfgf 598428,fdjrt-gjkdf-343 , urid_jf_fi fdgdf 847392 rtwsz 598423,fcvghb,kcjgrd gndfs this is the table output i have and i want the table to show only let say account id which is of length 6 and nothing else . – user2974748 Nov 20 '19 at 19:04
  • WITH dataset AS ( SELECT id , SPLIT(REPLACE(REPLACE((SELECT"architects"FROM xyz_teams),']'), '['), ',') AS architects FROM xyz_teams) SELECT array_agg(distinct id),regexp_replace(username,'[^[:alnum:]'' '']') AS architects from ( select split_part("sto architect",',',1) as username,id from xyz_teams where length(split_part("sto architect",',',1)) = cast (6 as integer) Union select split_part("sto architect",',',2) as username,id from xyz_teams where length(split_part("sto architect",',',2)) = cast (6 as integer)) temp_tbl group by regexp_replace(username,'[^[:alnum:]'' '']') – user2974748 Nov 20 '19 at 19:11