I've got several million rows in a PostgreSQL 9.3 table, and I'm struggling to calculate aggregate statistics.
Here are the columns:
object_id | date | column_with_json_data_in_key-value_format | bunch_of_irrelevant_columns
There is one row for every object_id/date combo. There are approximately 70 keys in the JSON column, but any given row only contains ~5-20 keys.
Across all object_id's, for each key, what is the daily average fractional value?
- Within each row, calculate the value of a particular key divided by the sum of all values for all keys for that particular row
- If a key doesn't appear in a row, treat it as having a value of 0 for that row
- Group the results by date
- Calculate the average value for each key
So my desired output is a table with the following columns:
date | key1_name | key1_fractional_average | keyN_name | keyN_fractional_average
I tried this query for a single row, but it's not working correctly:
SELECT value2, total, (value2/total * 100) AS percent FROM
(SELECT SUM(value) AS total FROM
(SELECT CAST(CAST(value AS text) AS integer) FROM json_each((SELECT json_column
FROM original_table
WHERE primary_key = 1)::json)) as table1) as total_table,
(SELECT CAST(CAST(value AS text) AS integer) AS value2 FROM json_each((SELECT json_column
FROM original_table
WHERE primary_key = 1)::json)) as value_table
For this particular page_id/insight_date row, the query returns the following:
value2 | total | percent
------------------------
1; 4; 0
2; 4; 0
1; 4; 0
The columns value2
and total
are correct, but the percent
column incorrectly shows 0's instead of the correct 25%, 50%, 25%.
Unfortunately, because JSON is a fairly new type in Postgres, googling hasn't turned up many examples. I'm thoroughly confused here.