0

I am parsing a JSON field in Redshift but the values are 6 levels deep. AWS says JSON_EXTRACT_PATH_TEXT has a 5 level limitation. I briefly tried a cte but am going down a rabbit hole as SQL and JSON are new to me. Alternatively I am wondering if the best place to do this is in dbt. If so I could see if that rabbit hole is less deep.

Code:

{
  "overall": {
    "WE": {
      "A": {
        "WE: one": "1",
        "WE: two": "5",
        "Work: three": "1"
}}
}

Expected Results:

WE  name value
--  ---- -----
A   one   1
A   two   5
A   three 1
michanne
  • 21
  • 1
  • 4
  • I don't have AWS redshift so I'll nudge you the right direction. Can you nest expressions? I had to change JSON_VALUE(field, '$.one.two.three.four.five.six') into JSON_VALUE(JSON_VALUE(field, '$.one.two.three.four.five') , '$.six') in order to dodge the depth limitation in Microsoft TSQL... – user922020 Jan 27 '20 at 19:03

1 Answers1

0

First of all, your input is not valid JSON ... (eg: you have 4 open { but only three }). Second of all, your input/output pair looks like you're trying to do more than just unnesting. For example, your nested key "WE:one" gets parsed into "name": "one". Third of all: given your expected output, you should have an array of records somewhere:

{
  "overall": {
    "WE": {
      "A": [
        {"WE: one": "1"},
        {"WE: two": "5"},
        {"Work: three": "1"}
      ]
    }
  }
}

Therefore you won't solve your issue with only unnesting.

Having said that, you can get started with one of Redshift's JSON functions. From the Redshift docs:

JSON_EXTRACT_PATH_TEXT returns the value for the key:value pair referenced by a series of path elements in a JSON string.

So in your case this would look like:

select 
    json_extract_path_text('{"overall": {"WE": {"A": {"WE: one": "1", "WE: two": "5", "Work: three": "1" }} }','overall', 'WE', 'A')
;

And then you can refer to Return elements of Redshift JSON array on separate rows to explode the arrays.

louis_guitton
  • 5,105
  • 1
  • 31
  • 33