1

I have a column named "impact" which has data in nested Json format

input: [{"internalid":"079","impactid":[{"position":"1","typeid":"NOEUD","value":"G1"},{"position":"2","typeid":"ID","value":"001"},{"position":"3","typeid":"CODE_CI","value":"14"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null},{"internalid":"6381075","impactid":[{"position":"1","typeid":"NOEUD","value":"G3"},{"position":"2","typeid":"ID","value":"003"},{"position":"3","typeid":"CI","value":"58"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null},{"internalid":"6381071","impactid":[{"position":"1","typeid":"NOEUD","value":"G2"},{"position":"2","typeid":"IDT","value":"002"},{"position":"3","typeid":"CI","value":"57"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null}]

I use the code below:

SELECT 
       get_json_object(single_json_table.identifiant, '$.position') AS impact_position,
       get_json_object(single_json_table.identifiant, '$.value')  AS impact_value
   FROM 
   (SELECT exp2.identifiant
  FROM  socle s
  lateral view explode(split(regexp_replace(substr(impact, 2, length(impact)-2),
          '},\\{"', '},,,,{"'), ',,,,')) exp2 as identifiant   
           )single_json_table 

Here is the results, it skips the first position and value, does anyone know how can i fix it please ?

impact_position  |  impact_value
(null)                (null)
2                     001
3                     14
(null)                (null)
2                     003
3                     58
(null)                (null)
2                     002
3                     57 
leftjoin
  • 36,950
  • 8
  • 57
  • 116
Stella
  • 69
  • 1
  • 10

1 Answers1

1

Your input is a JSON with nested arrays. Upper level array is the whole input, contains struct<internalid : string, impactid : array < struct <> > >, impactid is a nested array, containing struct elements like this: {"position":"1","typeid":"NOEUD","value":"G1"}

You need to explode both arrays. First explode upper array: change delimiters, split, explode, then do the same with nested array.

Demo:

with socle as (
select '[{"internalid":"079","impactid":[{"position":"1","typeid":"NOEUD","value":"G1"},{"position":"2","typeid":"ID","value":"001"},{"position":"3","typeid":"CODE_CI","value":"14"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null},{"internalid":"6381075","impactid":[{"position":"1","typeid":"NOEUD","value":"G3"},{"position":"2","typeid":"ID","value":"003"},{"position":"3","typeid":"CI","value":"58"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null},{"internalid":"6381071","impactid":[{"position":"1","typeid":"NOEUD","value":"G2"},{"position":"2","typeid":"IDT","value":"002"},{"position":"3","typeid":"CI","value":"57"}],"typeid":"BTS","cdrs":"X110","belong":"OF","impactclass":"R","count":"0","numberaccessimpacted":"0","impactcalculationrequest":null}]'
as impact

)

select internalid,
       get_json_object(e.impact, '$.position')  as position,
       get_json_object(e.impact, '$.value')  as value
from
(
select get_json_object(impacts, '$.internalid') internalid,
      --extract inner impact array, remove [], convert delimiters 
       regexp_replace(regexp_replace(get_json_object(impacts,'$.impactid'),'^\\[|\\]$',''),'\\},\\{','},,,,{') impact
from 
(
SELECT --First we need to explode upper array. Since it is a string, 
       --we need to prepare delimiters to be able to explode it
       --remove first [ and last ], replace delimiters between inner structure with 4 commas
       regexp_replace(regexp_replace(s.impact,'^\\[|\\]$',''),'\\},\\{"internalid"','},,,,{"internalid"') upper_array_str 
  FROM  socle s
)s lateral view explode (split(upper_array_str, ',,,,')) e as impacts --get upper array element
)s lateral view explode (split(impact, ',,,,') ) e as impact

Result:

internalid  position    value
079         1          G1
079         2          001
079         3          14
6381075     1          G3
6381075     2          003
6381075     3          58
6381071     1          G2
6381071     2          002
6381071     3          57
leftjoin
  • 36,950
  • 8
  • 57
  • 116
  • sorry to bother you @leftjoin, I have a little problem with that solution, so I have a bunch of other variable that i need to get, along with what in the json column, one of them is the id and that column don't come out properly, there is a lot of rows missing. could you please help me ? – Stella Dec 08 '20 at 13:48
  • @Stella okay. Describe your problem in details. If this is not related, better ask new question – leftjoin Dec 08 '20 at 13:58
  • @lefjoin so as I said, i have a column names "t_id" of type string that i need to get, along with the column "impact" of type JSON, so if i want to retrive the t_id: "20W064" in the first select : `SELECT t_id, regexp_replace(regexp_replace(s.impacts_impact,'^\\[|\\]$',''),'\\},\\{"internalid"','},,,,{"internalid"') upper_array_str FROM socle s where t_id = "20W064"` it works but when i use lateral view explode, no result is fetched for that id. I don't know if the problem is with lateral view or something else. Can you please help ? – Stella Dec 08 '20 at 16:43
  • 1
    @Stella Lateral view works similar like inner join. If lateral view does not return exploded rows because array is empty, then no rows returned from main table. Use `LATERAL VIEW OUTER...` instead of `LATERAL VIEW` - then in case of empty or absent array, it will return rows from main table (like LEFT JOIN). If this does not help w your problem, then pls prepare some representative data example and query – leftjoin Dec 08 '20 at 16:50
  • @Stella Read this for better understanding https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#LanguageManualLateralView-OuterLateralViews – leftjoin Dec 08 '20 at 16:54
  • 1
    thank you so much!!, in fact, it was just the lateral view that i had to replace with lateral view outer. – Stella Dec 08 '20 at 19:57