I have a data frame called Swine_flu_cases that looks as follows (just an extract):
  Country       Date Confirmed
1  Canada 2020-01-22         1
2   Egypt 2020-01-23         1
3 Algeria 2020-01-24         1
4  France 2020-01-25         1
5  Zambia 2020-01-26         1
6   Congo 2020-01-27         1
The data set records the number of confirmed swine flu cases in each country on a given date.
I have filtered the data to keep only the rows where the confirmed cases equal 1, grouped it by country, and sorted it by date in ascending order. I did this because I want to extract the date of each country's first recorded swine flu case and store those dates as a vector.
I tried doing so with the following code:
first_case_date = as.Date(data.frame(Swine_flu_cases$Date))
However, this gave me an error:
Error in as.Date.default(data.frame(Swine_flu_cases$Date)) : do not know how to convert 'data.frame(Swine_flu_cases$Date)' to class “Date”
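The error most likely arises because data.frame(Swine_flu_cases$Date) is a data frame, and as.Date() has no method for that class. A minimal sketch of one way around it, assuming the Date column holds ISO-formatted strings (or is already of class Date): convert the column itself, sort by date, and keep the first row per country. The sample rows below are made up for illustration and are not the real data.

```r
# Illustrative sample only; the real Swine_flu_cases has more rows.
Swine_flu_cases <- data.frame(
  Country   = c("Canada", "Egypt", "Canada", "Egypt"),
  Date      = c("2020-01-25", "2020-01-23", "2020-01-22", "2020-01-30"),
  Confirmed = c(3, 1, 1, 4),
  stringsAsFactors = FALSE
)

# Convert the column itself (a vector), not a data frame wrapping it.
Swine_flu_cases$Date <- as.Date(Swine_flu_cases$Date)

# Sort by date, then keep the first row seen for each country.
sorted_cases <- Swine_flu_cases[order(Swine_flu_cases$Date), ]
first_rows   <- sorted_cases[!duplicated(sorted_cases$Country), ]

# Date vector of first-case dates, named by country.
first_case_date <- first_rows$Date
names(first_case_date) <- first_rows$Country
```

Here first_case_date["Canada"] would give the earliest date recorded for Canada in the sample.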
What I want to do is create a new variable, Swine_flu_cases$days_since_first_case, which, for each country, subtracts the date of that country's first case from every other date recorded for that country.
My knowledge of for loops is very basic, but I think I need to use one for this. I have also recently familiarised myself with the lead and lag functions and wondered whether I could combine them to create this variable.
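For reference, this kind of per-group calculation can usually be done without a for loop or lead/lag. The sketch below uses base R's ave() to compute each country's earliest date and subtract it from every row. It assumes the data frame contains only rows with at least one confirmed case, so the minimum date per country is the first-case date; the sample rows are again made up for illustration.

```r
# Illustrative sample only; assumes every row has at least one confirmed case.
Swine_flu_cases <- data.frame(
  Country   = c("Canada", "Canada", "Egypt", "Egypt"),
  Date      = as.Date(c("2020-01-22", "2020-01-25", "2020-01-23", "2020-01-30")),
  Confirmed = c(1, 3, 1, 4),
  stringsAsFactors = FALSE
)

# Dates as plain day counts (days since 1970-01-01).
date_num <- as.numeric(Swine_flu_cases$Date)

# ave() applies min within each country and returns one value per row.
first_num <- ave(date_num, Swine_flu_cases$Country, FUN = min)

# Days elapsed since that country's first recorded case.
Swine_flu_cases$days_since_first_case <- date_num - first_num
```

With the dplyr package, the same idea would typically be written as a group_by(Country) followed by mutate(days_since_first_case = as.numeric(Date - min(Date))).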
If someone could give me a general idea of how to go about this, I would really appreciate it.