0

I run PostgreSQL and I have multiple (10+) tables with identical column names and formats and I would like to sum two of their columns. If a row is missing in one of the tables, I need it to be considered as 0 (zero).

Two-table EXAMPLE:

TableA:

primkey | year | month | day | data_value
--------+------+-------+-----+-----------
20120101| 2012 |  01   |  01 |    3.7
20120102| 2012 |  01   |  02 |    1.0

TableB:

primkey | year | month | day | data_value
--------+------+-------+-----+-----------
20120101| 2012 |  01   |  01 |    1.3
20120103| 2012 |  01   |  03 |    5.5

Wanted output:

primkey | year | month | day | data_value
--------+------+-------+-----+-----------
20120101| 2012 |  01   |  01 |    5.0
20120102| 2012 |  01   |  02 |    1.0
20120103| 2012 |  01   |  03 |    5.5

I was thinking of doing this:

SELECT a.primkey, a.year, a.month, a.day, a.data_value+b.data_valueas data_value FROM "TableA" as a FULL JOIN "TableB" as b ON a.primkey = b.primkey;

But it produces this (which is missing some of the data_values):

primkey | year | month | day | data_value
--------+------+-------+-----+-----------
20120101| 2012 |  01   |  01 |    5.0
20120102| 2012 |  01   |  02 |    
20120103| 2012 |  01   |  03 |   

Note: primkey is unique and a PRIMARY KEY and this needs to be valid for joining 10+ tables

TimY
  • 5,256
  • 5
  • 44
  • 57
  • FYI: there is a "date" datatype available. And a timestamp. And a datetime. – wildplasser Feb 24 '13 at 21:17
  • Thank you. I had actually considered it before. But the reason it's in that format is because the data is constantly transferred between Excel, PSQL and Python, and, for my specific purposes, it is substantially easier to keep it as an integer. – TimY Feb 24 '13 at 21:20

3 Answers3

2
SELECT COALESCE(a.primkey, b.primkey) AS primkey
  , COALESCE(a.year, b.year)          AS year
  , COALESCE(a.month, b.month)        AS month
  , COALESCE(a.day, b.day)            AS day
  , COALESCE(a.data_value,0) + COALESCE( b.data_value, 0) AS data_value
FROM "TableA" AS a
FULL JOIN "TableB" AS b ON a.primkey = b.primkey
   ;
wildplasser
  • 43,142
  • 8
  • 66
  • 109
1

Union all the tables and then group by the primary key. The sum of the value will be the result you are looking for.

Abraham
  • 479
  • 7
  • 23
1

Following Abraham's advice, here's the actual code:

SELECT x.primkey, x.year, x.month, x.day
   , sum(data_value) AS data_value 
FROM 
  (SELECT * FROM "TableA" 
   UNION ALL 
   select * FROM "TableB") 
AS x 
GROUP BY primkey, year, month, day;
Abraham
  • 479
  • 7
  • 23
TimY
  • 5,256
  • 5
  • 44
  • 57