0

I would like to do a LATERAL JOIN in Hive. Is there any way to support this? In essence I want to use values in the rows on the LHS as parameters to arbitrary SQL on the RHS.

Here is an example from Postgres: (Forgive my crude example):

create table lhs (
    subject_id integer,
    date_time  BIGINT );

create table events (
    subject_id  integer,
    date_time   BIGINT,
    event_val   integer );

SELECT * from lhs LEFT JOIN LATERAL ( select SUM(event_val) as val_sum, count(event_val) as ecnt from events WHERE date_time < lhs.date_time and subject_id = lhs.subject_id ) rhs1 ON true;
Jeremy
  • 682
  • 2
  • 8
  • 17

1 Answers1

0

Hive doesn't support LEFT JOIN LATERAL , use below query which is equivalent to your query.I have tested with sample data ,It is producing same result.

select subject_id,date_time,SUM(event_val) as val_sum,COUNT(event_val) as ecnt 
from (SELECT a.subject_id as subject_id ,
      a.date_time as date_time, b.date_time as bdate , b.event_val as event_val
      FROM events b LEFT OUTER JOIN lhs a 
      ON b.subject_id = a.subject_id) abc 
where bdate < date_time group by subject_id,date_time;

Hope that I will help you to formulated things how you can achieve the same in hive.

Sanjiv
  • 1,795
  • 1
  • 29
  • 45
  • Thanks.. I was hoping there was a better way.. That inner join is going to create a lot of data unless Hive is clever about it. – Jeremy Apr 05 '15 at 23:13