1

I have a jsonl file I've read into duckdb that looks like this:

('conversation_id', 'BIGINT', 'YES', None, None, None)
('text', 'VARCHAR', 'YES', None, None, None)
('meta', 'STRUCT(case_id VARCHAR, start_times DOUBLE[], stop_times DOUBLE[], speaker_type VARCHAR, side UBIGINT, "timestamp" DOUBLE)', 'YES', None, None, None)
('reply_to', 'VARCHAR', 'YES', None, None, None)
('speaker', 'VARCHAR', 'YES', None, None, None)

The elements of the STRUCT are arbitrary in any given jsonl file I'm importing, but they're identical in every row, so I'd like to unpack the STRUCT, create a column for each element, and then get rid of the STRUCT.

I've seen a number of answers that variously cover casting the STRUCT or unpacking it in specific domains like pyspark or bigquery, but I don't understand how to do it in vanilla SQL.

AutomaticStatic
  • 1,661
  • 3
  • 21
  • 42

1 Answers1

1

You can use the struct.* syntax for this, eg

SELECT a.* FROM (SELECT {'x':1, 'y':2, 'z':3} as a);

returns

x y z
1 2 3

For more information: https://duckdb.org/docs/sql/data_types/struct#struct

Mause
  • 461
  • 4
  • 9