0

Suppose the PostgreSQL table structure looks like the following:

 Time | Name | Value
  w   |   d  |   0
  x   |   a  |   1
  x   |   b  |   2
  y   |   c  |   3
  y   |   b  |   4
  z   |   c  |   5
  z   |   a  |   6
  z   |   d  |   7

I have the need of displaying the data as follows:

 Time | Name A | Name B | Name C |
  x   |    1   |    2   |        |
  y   |        |    4   |    3   |
  z   |    6   |        |    5   |

The number of Name variables is undefined and unknown prior to the query and the only filtering that is done is by Time. The table's row count goes into the millions and time is crucial. In other words, the user, at request time, is able to dynamically select the Name variables he/she wants to see displayed on the result as well as the Time range to filter by.

The approach I currently have implemented generates a dynamic query in real-time as the request comes in that looks like the following:

WITH SubQuery AS( 
  SELECT "Name", "Time", "Value" 
  FROM "Table" 
  WHERE "Name" IN ('a','b','c') AND "Time" BETWEEN 'x' AND 'z' 
  ORDER BY "Time" ASC 
) 
SELECT ("MasterTime") as "Time", "Name A","Name B","Name C"
FROM ( 
  SELECT "Time" as "MasterTime" 
  FROM SubQuery 
) AS "TimeData" FULL OUTER JOIN ( 
SELECT "Time" as "Time'A'", "Value" as "Name A" 
FROM SubQuery WHERE "Name" = 'a' 
) AS "Data'A'" ON "MasterTime"="Time'A'" FULL OUTER JOIN ( 
SELECT "Time" as "Time'B'", "Value" as "Name B" 
FROM SubQuery WHERE "Name" = 'b' 
) AS "Data'B'" ON "MasterTime"="Time'B'" FULL OUTER JOIN ( 
SELECT "Time" as "Time'C'", "Value" as "Name C" 
FROM SubQuery WHERE "Name" = 'c' 
) AS "Data'C'" ON "MasterTime"="Time'C'"
ORDER BY "MasterTime" ASC

I was thinking this may be a common issue in the world of data basing and there probably is a better solution out there. All suggestions are welcome.

RodCardenas
  • 585
  • 6
  • 14
  • You are asking for a dynamic pivot query. Nothing wrong with that, but you should check to make sure that millions of records won't generate too many columns, either for Postgres to support or for your result set to be usable – Tim Biegeleisen May 01 '17 at 15:36

2 Answers2

1

SQL cannot grow the number of columns after the query has been parsed and prepared. You can't make more columns appear based on the distinct values it discovers during execution. This means you need to hard-code the columns when you design the query.

SELECT "Time",
  COALESCE(MAX(CASE "Name" WHEN 'a' THEN "Value" END), '') AS "Name A",
  COALESCE(MAX(CASE "Name" WHEN 'b' THEN "Value" END), '') AS "Name B",
  COALESCE(MAX(CASE "Name" WHEN 'c' THEN "Value" END), '') AS "Name C"
FROM "Table" 
WHERE "Name" IN ('a','b','c') AND "Time" BETWEEN 'x' AND 'z' 
GROUP BY "Time"
ORDER BY "Time" ASC;

The CTE subquery is unnecessary and doesn't help.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • In my application's data flow, I create a query string that "hard codes" the values needed for the operation once the user has chosen the time filter and data fields of interest. The back-end then queries the database with the built string. Sorry if the wording of my question led to think otherwise. Thank you! This is by far faster than what I had. – RodCardenas May 01 '17 at 15:54
  • For a PostgreSQL specific solution, you might also look at the CROSSTAB function documented here: https://www.postgresql.org/docs/current/static/tablefunc.html I have not used it, so I don't know how good at performance it is. Anyway, my solution is standard SQL and should run on any brand of SQL database. – Bill Karwin May 01 '17 at 16:07
0

Another way of doing the problem is

SELECT time, a as name_a, b as name_b, c as name_c
  FROM
(
  SELECT time,name,value
    FROM table1
) s
PIVOT
(
  MAX(Value) FOR name IN (a,b,c)
) p
G.Arima
  • 1,171
  • 1
  • 6
  • 13
  • The OP tagged their question [tag:postgresql] and PostgreSQL doesn't support the PIVOT keyword. As far as I know, that's a nonstandard, Microsoft-specific solution. – Bill Karwin May 01 '17 at 16:08