I have annual earnings (loneink
) for 2007 for each individial ID (personlopnr
), and I want to calculate for each company ID (peorglopnr
) how much of the total wage bill was paid to worker born after 1980 (birth year is fodelsear
), in aggregate. However, the code below produces a share
column which is 0 for a vast majority of the cases and 1 for the rest. (To be clear, the code with WHERE loneink > 0
as below produces only 1s — the zero comeback without that condition and having a NULLIF
to make sure I never divide by zero.) While there are many firms with no young workers, it is clearly not the case that all the other firms are young-only.
What is wrong here? This wasn't the way to generate a "young-wage" variable where for older workers earnings are zero, so the sum is only for the young? Or in theory this is OK, but I got the CASE WHEN
wrong? Or the SUM
/SUM
misbehaves with GROUP BY
?
What is a better way to do this?
CREATE VIEW sys.over26_2007 (personlopnr,peorglopnr,loneink,below26_loneink) AS (
SELECT personlopnr,peorglopnr,loneink, CASE WHEN fodelsear < 1981 THEN 0 ELSE loneink END AS below26_loneink
FROM sys.ds_chocker_lev_lisaindivid_2007 WHERE loneink > 0
);
SELECT COUNT(*) FROM over26_2007;
CREATE VIEW sys.share26_2007 (peorglopnr,share26_2007) AS (
SELECT peorglopnr, SUM(below26_loneink)/SUM(loneink)
FROM sys.over26_2007
WHERE loneink > 0
GROUP BY peorglopnr
);
My actual use case is in MonetDB, so hopefully we can stick to SQL:2003 solutions only, no mySQL or Oracle extensions.