0

I am not getting any idea how to implement it in Hive. Please suggest the way.

Assume I have hive tables like this

Table1:
id | primary | secondary
-------------------------
 1 | A       | [B,C]
 2 | B       |  [A]
 3 | C       | [A,B]

Table2
id | primary | Heart | Liver | Kidney
-------------------------------------   
1  | A       | 1.3   | 0     | 0.9
2  | B       | 0     | 3     | 2.3
3  | C       | 0     | 9     | 4.3

Table3
id | medicine | primary 
-----------------------
1  | M1       | A
2  | M2       | A
3  | M3       | B
4  | M4       | B
5  | M1       | B

Now from these tables, if table1's primary (for example A) and secondary values (for ex: B) are present in table2 and if at least one tissue is present whose values are greater than zero (here kidney is there), then I have to print the following (sample output).

primary | direct  | secondary | indirect  | tissue_name
------------------------------------------------------
A       | M1,M2   | B         | M1,M3,M4  | kidney
B       | M1,M3,M4| A         | M1,M2     | kidney
C       | None    | B         | M1,M3,M4  | Liver, Kidney

Can anyone help me how to write the queries in Hive.

Thanks in advance.

Santhosh Tangudu
  • 759
  • 9
  • 19

1 Answers1

0

Transformed Table2 into below format:

Table2
id | primary | Tissues
-------------------------------------   
1  | A       | Heart, Kidney
2  | B       | Liver, Kidney
3  | C       | Liver, Kidney

Transformed Table3 into below format:

Table3
id | primary | drug_list
-------------------------
1  | A       | M1,M2
2  | B       | M1,M3,M4

From Table 1, I created two tables

  1. Joining Table1 with transformed Table2 and transformed Table3 on a primary column.
  2. Explode Table1 on a secondary column and Joining resultant table with transformed Table2 and transformed Table3 on exploded secondary column.

Join above two tables and find the results based on common tissue values.

Santhosh Tangudu
  • 759
  • 9
  • 19