0

I'm trying to transpose a variable in Hive such as:

Id1  Id2 Event
 1    1   7
 2    2   3
 2    2   7

to

 Id1  Id2 Event_7 Event_3
  1    1   1
  2    2   1        1

Following is what I have so far:

 create temporary table event_trans as 
           select Id1, Id2,Event
           kv['3'] as Event_3,
           kv['7'] as Event_7
           from(
             select Id1, Id2, collect(Event, '1') as kv
             from event1
             group by Id1, Id2

             )t

Error: Error while compiling statement: FAILED: ParseException line 1:84 missing EOF at '[' near 'kv'

I'm also interested to know how to transpose a dataset in Hive with duplicates such as to the same output:

Id1  Id2 Event
 1    1   7
 2    2   3
 2    2   7
 2    2   7

to

 Id1  Id2 Event_7 Event_3
  1    1   1
  2    2   1        1

Appreciate for any help!

lydias
  • 841
  • 1
  • 14
  • 32

1 Answers1

1

In Hive SQL, you can do conditional aggregation:

select 
    id1,
    id2,
    max(case when event = 7 then 1 end) event_7,
    max(case when event = 3 then 1 end) event_3
group by id1, id2
order by id1, id2
GMB
  • 216,147
  • 25
  • 84
  • 135
  • if I have more than 2 categories (not just 7 and 3) under Event will this procedure be too time consuming? – lydias Jan 15 '20 at 17:45