0

I have a table with below columns and data types.

id string
name string
title string
listItems string

purchase_list table:

id name title listItems
123 Peter Purchase List [{"manufacture_date":"2023-01-01","purchase_price":"20.0"}, {"manufacture_date":"2023-01-02","purchase_price":"30.0"}]

Query used:

with raw_data as
(
select 
      id
    , name
    , title
    , json_parse(listItems) as listItems
from purchase_list
)
select 
      id
    , name
    , title
    , json_extract_scalar(listItems,'$.manufacture_date') as manufacture_date
    , json_extract_scalar(listItems,'$.purchase_price') as purchase_price
from raw_data

Expected output:

id name title manufacture_date purchase_price
123 Peter Purchase List 2023-01-01 20.0
123 Peter Purchase List 2023-01-02 30.0

When I extract data using json_extract_scalar, I am getting blank values

id name title listItems
123 Peter Purchase List

Please advise how I can achieve the expected output.

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
vvazza
  • 421
  • 7
  • 21

1 Answers1

0

json_extract_scalar works with scalar values, arrays are not scalar values and your paths do not account for array in the JSON.

One way to handle this is to cast your JSON to array of JSON and unnest:

-- sample data
with dataset(listItems) as (
    values ('[{"manufacture_date":"2023-01-01","purchase_price":"20.0"}, {"manufacture_date":"2023-01-02","purchase_price":"30.0"}]')
)

-- query
select json_extract_scalar(item, '$.manufacture_date') as manufacture_date
    , json_extract_scalar(item, '$.purchase_price') as purchase_price
from dataset,
unnest(cast(json_parse(listItems) as array(json))) as t(item);

Output:

manufacture_date purchase_price
2023-01-01 20.0
2023-01-02 30.0

Alternatively you can dabble with Trino's json_query:

-- query 
select json_query(
                 listItems,
                 'lax $[*].manufacture_date'
                 WITH ARRAY WRAPPER
                ) AS dates,
    json_query(
                 listItems,
                 'lax $[*].purchase_price'
                 WITH ARRAY WRAPPER
                ) AS prices
from dataset;

Output:

dates prices
["2023-01-01","2023-01-02"] ["20.0","30.0"]

Which can also be processed with unnest (i.e. unnest(json_query(...), json_query(...)) as t(manufacture_date, purchase_price))

Guru Stron
  • 102,774
  • 10
  • 95
  • 132