Hortonworks HDP 2.3.0 - Hive 0.14
Table T1 (partitioned on col1, not bucketed, ORC)
approx. 120 million rows & 6 GB data size
Table T2 (partitioned on col2, not bucketed, ORC)
approx. 200 million rows & 6 MB data size
Query: T1 left outer join T2 on ( t1.col3 = t2.col3 )
The above query runs for a very long time in the last reducer phase, in both Tez and MR mode. I also tried auto convert join set to true and to false, as well as an explicit mapjoin.
The query still gets stuck in the last reducer phase and never finishes.
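For reference, the setup described above looks roughly like the sketch below. The SELECT list and the aliases are placeholders, not the real query, and I am assuming "auto convert" refers to the hive.auto.convert.join setting.

```sql
-- Sketch only: column list and aliases are placeholders.
-- Execution engine: tried both tez and mr.
SET hive.execution.engine=tez;
-- Automatic map-join conversion: tried both true and false.
SET hive.auto.convert.join=true;

-- Explicit map-join hint on the smaller table (T2); dropped in the runs without a hint.
SELECT /*+ MAPJOIN(t2) */
       t1.*, t2.*
FROM   T1 t1
LEFT OUTER JOIN T2 t2
       ON (t1.col3 = t2.col3);
```

Note that Hive only honours the MAPJOIN hint when hive.ignore.mapjoin.hint is set to false, so the hinted run may have behaved the same as the unhinted one.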
FYI - if the data size of T2 is either 9k or 1 GB, the query finishes.