1

I want to create a table "table_min_date_100d_per_country" which contains the first date where the cumulation by date of COVID cases exceeds 100 per country.

I have the columns date, cas_covid, country.

Sample data is..

Date              Cas_covid          country
2019-12-31            10             France
2020-01-01            15             France
2020-01-02            45             France       
2020-01-03            5              France
2020-01-04            15             France
2020-01-05            11             France

The output is

2020-01-05    COVID cases = 101      country = France

Thanks.

JavaTechnical
  • 8,846
  • 8
  • 61
  • 97
Chafik Ahl
  • 11
  • 2

1 Answers1

0

If you are using SAS, it is much easier to get the cumulative sum with data step. There is no direct way of doing so with proc sql. Assuming your data is called "old_data" and it is already sorted by country and date, the following code will create a new dataset with the cumulative sum ("cum_sum") variable, by country:

data temp_data;
set old_data;
by country;
if first.country then cum_sum=0;
cum_sum+Cas_covid;
run;

After calculating the cumulative sum by country, you can get your desired output with proc sql, if you prefer, by evaluating only the results of cum_sum over 99 and keeping only the minimum for every country, like:

proc sql;
create table table_min_date_100d_per_country as
select distinct
date,
cum_sum as COVID_cases,
country
from temp_data
group by country /*This line gets you summarizing statistics by country*/
where cum_sum >= 100 /*This line says that you only evaluate results >= 100*/
having COVID_cases = min(COVID_cases) /*Within the end table, you only keep the minimum number of covid cases per country (after preselecting above 99)*/;
quit;

If your data is not sorted, you should first run

proc sort data=old_data;
by country date;

Best regards,

LuizZ
  • 945
  • 2
  • 11
  • 23