
I have the following two datasets:

  1. Monthly water quality data
  2. Rainfall (hourly observations over a number of days)

The first dataset, water quality, mostly contains concentrations of contaminants:

>wq_data

Date         TSS       TZn       TCu
2/02/1995    16.0      0.02      0.006
9/03/1995    10.0      0.03      0.005
7/04/1995     8.2      0.04      0.004
10/05/1995    4.3      0.04      0.006

Rainfall data is hourly data for a number of days.

>Data_ppt

Date               Rain
1/02/1995 01:00    0.0
1/02/1995 02:00    1.87
1/02/1995 03:00    0.0
1/02/1995 04:00    0.0
1/02/1995 05:00    0.0
.....
2/03/1995 01:00    0.0

I am trying to extract rows from Data_ppt based on the dates in wq_data. I understand this can be done using a number of techniques, such as those mentioned in this Q&A, but I need to go one step further and extract the Data_ppt rows for the 1 or 5 days prior to each record date in wq_data.

I want a new dataset that looks like this (taking the case of 1 day prior to wq_data$Date):

>1day_prior

Date               Rain
1/02/1995 01:00    0.0
1/02/1995 02:00    1.87
1/02/1995 03:00    0.0
1/02/1995 04:00    0.0
1/02/1995 05:00    0.0
1/02/1995 06:00    0.0
1/02/1995 07:00    0.0
1/02/1995 08:00    0.0
1/02/1995 09:00    0.0
1/02/1995 10:00    0.0
1/02/1995 11:00    0.60
1/02/1995 12:00    0.0
1/02/1995 13:00    0.0
1/02/1995 14:00    0.0
1/02/1995 15:00    0.0
1/02/1995 16:00    0.0
1/02/1995 17:00    0.0
1/02/1995 18:00    0.50
1/02/1995 19:00    0.0
1/02/1995 20:00    0.0
1/02/1995 21:00    0.0
1/02/1995 22:00    0.0
1/02/1995 23:00    0.0
1/02/1995 24:00    0.0
8/03/1995 01:00    0.0 
8/03/1995 02:00    0.78 and so forth

Please let me know if I need to provide any clarifications or edits to make this a better-worded question.

Sally

1 Answer


To get the range of prior dates, you could do this in base R:

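# first get the sequence of the 5 dates before each sampling date
# (the dates are day/month/year, so as.Date() needs an explicit format)
fmt = "%d/%m/%Y"
dates = do.call("c", lapply(split(wq_data, wq_data$Date), 
                     function(x) seq(as.Date(x$Date, format = fmt) - 5,
                                     as.Date(x$Date, format = fmt) - 1, by = 1)))

# use the sequence to select the dates from the second data frame
# (the time-of-day part of Data_ppt$Date is ignored when parsing)
Data_ppt[as.Date(Data_ppt$Date, format = fmt) %in% dates,]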
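
Since the question asks for both the 1-day and the 5-day case, the same idea can be wrapped in a small function that takes the window length as an argument. The following is only a sketch: the helper name rain_prior, its arguments, and the tiny example data frames are made up for illustration, and it assumes both Date columns are day/month/year strings as shown in the question.

```r
# Small stand-ins for the data frames in the question (values are illustrative)
wq_data <- data.frame(
  Date = c("2/02/1995", "9/03/1995"),
  TSS  = c(16.0, 10.0),
  stringsAsFactors = FALSE
)
Data_ppt <- data.frame(
  Date = c("1/02/1995 01:00", "1/02/1995 02:00", "2/02/1995 01:00",
           "4/03/1995 01:00", "8/03/1995 02:00"),
  Rain = c(0.0, 1.87, 0.0, 0.3, 0.78),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rainfall rows that fall 1..n_days before any sampling date
rain_prior <- function(wq, ppt, n_days = 1, fmt = "%d/%m/%Y") {
  wq_dates  <- as.Date(wq$Date, format = fmt)
  ppt_dates <- as.Date(ppt$Date, format = fmt)  # trailing "HH:MM" is ignored
  # every (sampling date - k) for k = 1..n_days, as day counts since 1970-01-01
  offsets <- outer(as.numeric(wq_dates), seq_len(n_days), "-")
  prior   <- as.Date(as.vector(offsets), origin = "1970-01-01")
  ppt[ppt_dates %in% prior, , drop = FALSE]
}

one_day_prior  <- rain_prior(wq_data, Data_ppt, n_days = 1)
five_day_prior <- rain_prior(wq_data, Data_ppt, n_days = 5)
```

Because only the date part of Data_ppt$Date is parsed, a row stamped 24:00 stays with the calendar date written in front of it, which matches the desired output in the question.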
Veerendra Gadekar