I face the following problem: in a SQL table I have stored information about persons, dates, and diseases.
CREATE TABLE Diseases (person varchar(80), time date, disease varchar(80));
INSERT INTO Diseases (person, time, disease) VALUES ('Harry', '2018-03-06', 'A30');
INSERT INTO Diseases (person, time, disease) VALUES ('Harry', '2017-04-06', 'C27');
INSERT INTO Diseases (person, time, disease) VALUES ('Harry', '2016-08-12', 'Z07');
INSERT INTO Diseases (person, time, disease) VALUES ('Harry', '2016-08-12', 'C27');
INSERT INTO Diseases (person, time, disease) VALUES ('Ruth', '2018-08-12', 'B02');
INSERT INTO Diseases (person, time, disease) VALUES ('Ruth', '2018-08-12', 'C02');
Now I want to aggregate the disease codes of each person into one variable:
SELECT Diseases.person,
STRING_AGG(Diseases.disease, ',') AS Diags,
Diseases.time
FROM Diseases
GROUP BY Diseases.person, Diseases.time
How can I aggregate only those diseases that were recorded on or before the date of the current case? For example, Harry's row for the 2017 case should contain Z07 and C27.
The result should be a table with name, date, and a column containing the aggregated list of diseases. How do I ensure that only distinct disease values are included?
I tried the following, but it doesn't work:
SELECT
today.person,
today.time,
today.disease,
STRING_AGG(DISTINCT past.DISEASE, ',') AS disease_history
FROM diseases today
LEFT JOIN diseases past
ON past.person = today.person AND past.time <= today.time
GROUP BY today.person, today.time, today.disease
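To make the result I am after concrete, here is a minimal sketch that reproduces it in SQLite through Python's sqlite3 module. This is an assumption for illustration only, not my actual database: group_concat stands in for STRING_AGG, and the names cases, firsts and first_seen are hypothetical. The idea is to avoid DISTINCT inside the aggregate by first reducing the table to one row per person and disease (with its earliest date), then joining that to the distinct person/date pairs.

```python
import sqlite3

# In-memory SQLite database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Diseases (person varchar(80), time date, disease varchar(80))"
)
conn.executemany(
    "INSERT INTO Diseases (person, time, disease) VALUES (?, ?, ?)",
    [
        ("Harry", "2018-03-06", "A30"),
        ("Harry", "2017-04-06", "C27"),
        ("Harry", "2016-08-12", "Z07"),
        ("Harry", "2016-08-12", "C27"),
        ("Ruth", "2018-08-12", "B02"),
        ("Ruth", "2018-08-12", "C02"),
    ],
)

# cases:  one row per person and case date.
# firsts: one row per person and disease, with the earliest date it appeared,
#         so each code can be aggregated at most once per case.
QUERY = """
SELECT cases.person,
       cases.time,
       group_concat(firsts.disease, ',') AS disease_history
FROM (SELECT DISTINCT person, time FROM Diseases) AS cases
JOIN (SELECT person, disease, MIN(time) AS first_seen
      FROM Diseases
      GROUP BY person, disease) AS firsts
  ON firsts.person = cases.person
 AND firsts.first_seen <= cases.time
GROUP BY cases.person, cases.time
ORDER BY cases.person, cases.time
"""

# The order of codes inside the aggregated string is not guaranteed,
# so the codes are split and sorted before printing.
history = {
    (person, time): sorted(codes.split(","))
    for person, time, codes in conn.execute(QUERY)
}
for key, codes in history.items():
    print(key, codes)
```

Whether the same query shape carries over unchanged to STRING_AGG on my DBMS is something I have not verified.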