5

I have two table as follows:

Table A

userid | code | code_name | property_id
0001   | 1    | apple_id  | Y1234
0031   | 4    | mango_id  | G4567
0008   | 3    | grape_id  | H1209
00013  | 2    | peach_id  | Z5643

Table 2

apple_id | mango_id | grape_id | peach_id | new_id
Y1234    |  R1890   |          |          | N456098
         | G4567    |          |  B3490   | N002345
T3336    |          | H1209    |  F3467   | N129087
         |  D7865   | J6543    |  Z5643   | N109876

Desired Resultant table

userid | new_id
0001   | N456098
0031   | N002345
0008   | N129087
00013  | N109876

Using the code_name in Table A, I would like to find the respective property_id from Table A in Table B. Basically, match on the column name in Table B. The aim is to get the corresponding new_id.

Apple, mango, grape and peach ids can be the same. However, new_id values will be unique.

Is this possible in Hive? There does not seem to be any unpivot/pivot functionality in Hive.

Any help would be really great. Thanks!

activelearner
  • 7,055
  • 20
  • 53
  • 94

2 Answers2

20

Whenever I want to pivot a table in Hive, I collect key:value pairs to a map and then reference each key in the next level, creating new columns. This is the opposite of that.

Query:

select a.userid, y.new_id
from (
  select new_id, fruit_name, fruit_code
  from (
    select new_id, map("apple_id", apple_id
                     , "mango_id", mango_id
                     , "grape_id", grape_id
                     , "peach_id", peach_id) as fruit_map
    from table_2 ) x
  lateral view explode(fruit_map) exptbl1 as fruit_name, fruit_code ) y
join table_A a
on (y.fruit_code=a.property_id)

Output:

0001    N456098
0031    N002345
0008    N129087
00013   N109876
o-90
  • 17,045
  • 10
  • 39
  • 63
  • 1
    This should have been marked as the correct answer. Excellent demonstration of map and explode for these kind of problems. It is a much better approach compared to using UNIONS, especially in hive when dealing with large map-reduce jobs coming out of the SQLs. – NG Algo Jun 27 '16 at 23:52
  • 1
    How would this be extended such that you don't have to manually write all the different levels (i.e. apple, mango, grape, peach). It'd be ideal if something like `select distinct code_name` works as the input to create new columns – Anonymous May 18 '20 at 09:33
0

You actually dont need to unpivot to get the result.

SELECT * FROM 
(
    SELECT userid,
        CASE WHEN property_id = apple_id THEN new_id  
             WHEN property_id = mango_id THEN new_id  
             WHEN property_id = grape_id THEN new_id  
             WHEN property_id = peach_id THEN new_id
        END AS newid
    FROM
    (SELECT * FROM a FULL JOIN b) X
) y
WHERE newid IS NOT NULL;

Result:

y.userid  y.newid
1         N456098
31        N002345
8         N129087
13        N109876
DavidMWilliams
  • 834
  • 2
  • 14
  • 22
  • Is this answer down-voted because it's computationally more expensive than the explod funcitons? – Zafar Apr 22 '19 at 17:45