4

I feel this should be simple, but I've struggled to find the right terminology, please bear with me.

I have two columns, timestamp and voltages which is the array

If I do a simple

SELECT timestamp, voltages FROM table

Then I'd get a result of:

|timestamp | voltages | |1544435470 |3.7352,3.749,3.7433,3.7533| |1544435477 |3.7352,3.751,3.7452,3.7533| |1544435484 |3.7371,3.749,3.7433,3.7533| |1544435490 |3.7352,3.749,3.7452,3.7533| |1544435497 |3.7352,3.751,3.7452,3.7533| |1544435504 |3.7352,3.749,3.7452,3.7533|

But I want to split the voltages array so each element in its array is its own column.

|timestamp | v1 | v2 | v3 | v4 | |1544435470 |3.7352 |3.749 |3.7433 |3.7533| |1544435477 |3.7352 |3.751 |3.7452 |3.7533| |1544435484 |3.7371 |3.749 |3.7433 |3.7533| |1544435490 |3.7352 |3.749 |3.7452 |3.7533| |1544435497 |3.7352 |3.751 |3.7452 |3.7533| |1544435504 |3.7352 |3.749 |3.7452 |3.7533|

I know I can do this with:

SELECT timestamp, voltages[1] as v1, voltages[2] as v2 FROM table

But I'd need to be able to do this programmatically, as opposed to listing them out.

Am I missing something obvious?

Daniel Crowley
  • 496
  • 1
  • 6
  • 10
  • 1
    Daniel, SQL is statically typed. Number of result columns need to be known before query is executed, so there can be no `v1, v2, ...` (depending on actual length of the arrays). Having said that, you can still "programatically" generate SQL. But you can't have SQL alone do this for you. – Piotr Findeisen Dec 13 '18 at 08:32
  • Thanks Piotr, I was wondering that, but I thought if that was the case there might be a way to do it with a secondary query on the result of the first. Hmmm. – Daniel Crowley Dec 13 '18 at 08:53
  • Does this answer your question? [athena presto - multiple columns from long to wide](https://stackoverflow.com/questions/63142257/athena-presto-multiple-columns-from-long-to-wide) – Theo Sep 12 '20 at 07:44

1 Answers1

3

This should serve your purpose if you have arrays of fixed length.
You need to first break down each array element into it's own row. You can do this using the UNNEST operator in the following way :

SELECT timestamp, volt
FROM table
CROSS JOIN UNNEST(voltages) AS t(volt)

Using the resultant table you can pivot (convert multiple rows with the same timestamp into multiple columns) by referring to Gordon Linoff's answer for "need to convert data in multiple rows with same ID into 1 row with multiple columns".

outis
  • 75,655
  • 22
  • 151
  • 221
iammrmehul
  • 730
  • 1
  • 14
  • 35
  • Thanks for that, interesting approach. At the moment I don't know the array length upfront. Do you see this having any advantages over the last method in my question? I'd say the element addressing is clearer to understand, but I don't know if your suggested method has other benefits. – Daniel Crowley Dec 13 '18 at 09:00