0

im in the process on transferiing data from oracle to hive with thiveinput talend.

My code looks like this.

SELECT 
DISTINCT A.ID,
  LEVEL SEQUENCE,
  REGEXP_SUBSTR(A.ANEST,'[^|]+', 1, LEVEL),
  DATE
FROM
( SELECT A.* 
FROM tableaa A,
   tablebb B
  WHERE A.IDX = B.IDY
  and A.DATE = B.DATE
  ) A
CONNECT BY INSTR(A.ANEST, '|', 1, LEVEL-1) > 0
AND PRIOR sys_guid() IS NOT NULL

Would u mind to explain in the simple way, what connect by instr stands for? And how should I write in hive?

thank you

1 Answers1

0

"CONNECT BY" is a way of performing a recursive lookup, like data_id and parent_id in a row, where parent_id would point to a prior data_id in another row. This allows Oracle to rapidly construct hierarchical relationship trees and the like.

There is no native equivalent for this in Hive.

I did see one blog post for creating an external function to support something similar in Hive, which you could check out here: https://blog.pythian.com/recursion-in-hive/

pmdba
  • 6,457
  • 2
  • 6
  • 16