There are two ways to do this, and I'm not sure which would be faster in Redshift. One is union all
and group by
:
select sid, sum(totalcorrect) as totalcorrect, sum(totalquestions) as totalquestions
from ((select sid, totalcorrect, totalquestions
from t1
) union all
(select sid, totalcorrect, totalquestions
from t2
)
) t
group by sid;
The second uses full join
, for which I recommend using the using
clause:
select sid,
coalesce(t1.totalcorrect, 0) + coalesce(t2.totalcorrect, 0) as totalcorrect,
coalesce(t1.totalquestions, 0) + coalesce(t2.totalquestions, 0) as totalquestions
from t1 full join
t2
using (sid);
There are differences between these two approaches. The first guarantees one row per sid
in the result set, even if there are duplicates in one of the tables. The first also combines NULL
values of sid
into a single row.