
I've got 7 years of temperature data split into 4 seasonal variables (Spring, Summer, Autumn, Winter), each of which looks like this (Spring example):

Day Month Year  maxtp  Season.Year Season
 1    3   2008   13.6     2008       SP
 2    3   2008   11.3     2008       SP
 3    3   2008   5.4      2008       SP

I want to create multiple new temperature series based on these observed data, one at a time, in the following way (using a similar approach to this): Block sampling according to index in panel data

Using this code

newseries1 <- sample(Spring, size=91, replace = T, prob = NULL)

But this replicated the series 91 times, and isn't what I want.

I want to select an entire Spring block from a random season.year (2008-2014), then select a Summer block from any year EXCEPT the year that was chosen previously (for example, any year other than 2008 if 2008 was drawn for Spring). Each resampled year is then replaced so it can be drawn again later, just not twice in a row.

I want to take a season.year from the Spring variable, follow it with a different season.year for the Summer variable, then another for Autumn, and another for Winter, and keep doing this until the resampled series is the same length as the observed one (7 years in this case).

So in summary I want to:

  1. Select a 'block' respecting the annual sequence (Spring from a random season.year) and begin a new series with it, then replace it so it can be sampled again.
  2. Follow Spring with Summer from a different season.year than the one just used, and replace it.
  3. Keep going until the resampled series is the same length as the observed
  4. Repeat this process until there are 100 resampled series
Pad

1 Answer


For newseries1, try this instead:

ndays <- nrow(Spring)
#select rows of Spring randomly (are you sure you want replace = TRUE?)
newseries1 <- Spring[sample(1:ndays, size = ndays, replace = TRUE, prob = NULL),]
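As a side note, here is a small sketch of why the original call did not behave as expected. As far as I can tell, sample() treats a data frame as a list of columns, so it draws columns rather than days. The data frame toy is a hypothetical stand-in for Spring, built from the three rows shown in the question.

```r
# Toy stand-in for Spring, using the three rows shown in the question
toy <- data.frame(Day = 1:3, Month = 3, Year = 2008, maxtp = c(13.6, 11.3, 5.4))

set.seed(1)
# sample() on a data frame draws from its columns, so this returns 5 columns
by_col <- sample(toy, size = 5, replace = TRUE)
dim(by_col)   # 3 rows, 5 columns

# indexing with sampled row numbers resamples the days instead
by_row <- toy[sample(nrow(toy), size = nrow(toy), replace = TRUE), ]
dim(by_row)   # 3 rows, 4 columns
```

Row indexing keeps every column intact, which is why the version above goes through Spring[rows, ].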

Then for selecting the year data for each season successively:

y.lst <- 2008:2014
nssn <- 7*100*4 #7 years x 100 series x 4 seasons: total number of season blocks to draw
y <- rep(NA, nssn) #initialise: vector of selected years, one per season block
#first spring
y[1] <- sample(y.lst, 1)
#subsequent seasons
for(s in 2:nssn){
  #selects a year from a sublist of years which excludes that of the previous season
  y[s] <- sample(y.lst[y.lst != y[s - 1]], 1)
}
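To check that the loop does what is intended, the same logic can be wrapped in a function and tested for back-to-back repeats. This is only a sketch: draw_years is a hypothetical helper name, and the length check is my own addition to guard against sample() switching to its 1:x behaviour when only a single numeric candidate is left.

```r
# Hypothetical helper: draw n years so that no two neighbouring draws are equal
draw_years <- function(years, n) {
  # with at least 3 distinct years the candidate set always has 2+ elements
  stopifnot(length(unique(years)) >= 3)
  y <- integer(n)
  y[1] <- sample(years, 1)
  for (s in seq_len(n)[-1]) {
    # exclude the year used for the previous season, then draw one
    y[s] <- sample(years[years != y[s - 1]], 1)
  }
  y
}

set.seed(42)
y <- draw_years(2008:2014, 7 * 4)   # one 7-year series: 28 season blocks
any(diff(y) == 0)                   # FALSE when no year is used twice in a row
```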

Then compile the data frame (assume original data is in data frame data):

#first Spring
Ssn <- data[with(data, Year == y[1] & Season == "SP"),]
#optional: resample the days within the block (not used in data2 below)
ndays <- nrow(Ssn)
newseries1 <- Ssn[sample(1:ndays, size = ndays, replace = TRUE, prob = NULL),]
#initialise data frame
data2 <- Ssn
#subsequent seasons
for(s in 2:nssn){
  #replace "..." with the season label for step s
  Ssn <- data[with(data, Year == y[s] & Season == "..."),]
  ndays <- nrow(Ssn)
  newseries1 <- Ssn[sample(1:ndays, size = ndays, replace = TRUE, prob = NULL),]
  data2 <- rbind(data2, Ssn)
}

You will need to create a vector of season labels to choose from. Use the %% (modulo) operator to pick the appropriate season label at each step (e.g. s %% 4 == 2 implies "SU").
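Putting the pieces together, a minimal sketch might look like the following. Several things here are assumptions rather than facts about your data: the codes "AU" and "WI" (only "SP" and "SU" appear above), matching on Season.Year instead of Year (which may matter if a winter block spans two calendar years), the helper name resample_series, and the toy data frame with 3 days per season and random placeholder temperatures.

```r
# Season codes in annual order; "AU" and "WI" are assumed labels
ssn.lst <- c("SP", "SU", "AU", "WI")

# Toy observed data: 3 days per season per year, placeholder temperatures
set.seed(1)
data <- expand.grid(Day = 1:3, Season = ssn.lst, Season.Year = 2008:2014,
                    stringsAsFactors = FALSE)
data$maxtp <- round(rnorm(nrow(data), mean = 12, sd = 5), 1)

# Hypothetical helper: one resampled series of n.years annual cycles
resample_series <- function(data, years, n.years) {
  n <- n.years * 4
  y <- integer(n)
  y[1] <- sample(years, 1)
  for (s in seq_len(n)[-1]) y[s] <- sample(years[years != y[s - 1]], 1)
  blocks <- lapply(seq_len(n), function(s) {
    # (s - 1) %% 4 + 1 cycles through 1, 2, 3, 4, i.e. SP, SU, AU, WI
    lab <- ssn.lst[(s - 1) %% 4 + 1]
    data[data$Season.Year == y[s] & data$Season == lab, ]
  })
  do.call(rbind, blocks)   # bind all blocks once instead of growing a data frame
}

# 100 resampled series, each as long as the observed record
series <- replicate(100, resample_series(data, 2008:2014, 7), simplify = FALSE)
```

Keeping the 100 series as separate list elements, rather than one long data frame, also means the first Spring of each series is drawn independently of the last Winter of the previous one.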

CJB