I have two tables:
T1
key code1 code2 code3
1 A A A
2 B B G
3 A B C
4 C C C
5 D E F
6 E E E
7 A D G
8 G G G
T2
code class1 class2 class3
A 1 0 0
B 0 1 0
C 0 1 0
D 1 1 0
E 0 0 1
F 0 1 0
G 1 0 0
I want to write a query like...
create table T3 as
select key, case
when code1 in (select code from T2 where class1 = 1) or
code2 in (select code from T2 where class1 = 1) or
code3 in (select code from T2 where class1 = 1)
then 1 else 0
end as class1,
case
when code1 in (select code from T2 where class2 = 1) or
code2 in (select code from T2 where class2 = 1) or
code3 in (select code from T2 where class2 = 1)
then 1 else 0
end as class2,
case
when code1 in (select code from T2 where class3 = 1) or
code2 in (select code from T2 where class3 = 1) or
code3 in (select code from T2 where class3 = 1)
then 1 else 0
end as class3
from T1
It is basically saying for each key in T1, look to see if the corresponding code in T2 has a 1 for each class. If so, then the new column is 1.
The problem is, I am using a version of HiveQL that doesn't support subqueries in conditional statements like this. Is there an alternative method to achieve the same result set? I was thinking something like joining, but I'm not sure of the best way to go about that in this case due to T2 not having a corresponding key.
For reference, the result set would be
T3
key class1 class2 class3
1 1 0 0
2 1 1 0
3 1 1 0
4 0 1 0
5 1 1 1
6 0 0 1
7 1 1 0
8 1 0 0