You can create a vector of all the target dates and sample from it. To create the vector, use seq.Date, the seq method for objects of class "Date".
start <- as.Date("2008-01-01")
end <- as.Date("2010-12-31")
s <- seq(start, end, by = "days")
The vector s includes all days between start and end. Now sample from it.
set.seed(123)
dat1 <- sample(s, 10000, TRUE)
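As a quick sanity check, the sketch below rebuilds the same objects and confirms that the sample is still of class "Date" and stays within the requested range. It repeats the code above so that it runs on its own; only the checks are new.

```r
start <- as.Date("2008-01-01")
end <- as.Date("2010-12-31")
s <- seq(start, end, by = "days")

length(s)
# 1096: 366 days in 2008 (a leap year) plus 365 + 365

set.seed(123)
dat1 <- sample(s, 10000, TRUE)

class(dat1)                       # sampling keeps the "Date" class
all(dat1 >= start & dat1 <= end)  # every draw lies inside the range
```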
Transform the sample into day of the year with the "%j" format. See help("strptime") for the available format codes.
as.numeric(format(dat1, format = "%j"))
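To see what "%j" returns, here is a small sketch on a few hand-picked dates (the vector d is only for illustration). It also compares the result with the yday field of POSIXlt, which is an alternative not used in this answer; yday is zero-based, hence the + 1.

```r
d <- as.Date(c("2008-01-01", "2008-03-01", "2009-03-01", "2008-12-31"))

# "%j" gives a zero-padded string such as "061", so convert it to a number
doy_fmt <- as.numeric(format(d, format = "%j"))
doy_fmt
# 1 61 60 366  (2008 is a leap year, so March 1 is day 61, not 60)

# Alternative: POSIXlt stores the day of the year counted from 0
doy_lt <- as.POSIXlt(d)$yday + 1
all.equal(doy_fmt, doy_lt)
```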
Finally, remove s; it is no longer needed.
rm(s) # tidy up
Edit.
The following two functions both do what the question asks, using two different methods. f1 is the code above wrapped in a function; f2 uses ave/seq_along/match and is a bit more complicated. The tests show f2 to be about twice as fast as f1.
f1 <- function(start_date, end_date, n){
  start <- as.Date(start_date)
  end <- as.Date(end_date)
  s <- seq(start, end, by = "days")     # all candidate dates
  y <- sample(s, n, replace = TRUE)
  as.numeric(format(y, format = "%j"))  # day of the year via "%j"
}
f2 <- function(start_date, end_date, n){
  start <- as.Date(start_date)
  end <- as.Date(end_date)
  s <- seq(start, end, by = "days")
  y <- sample(s, n, replace = TRUE)
  # number the dates within each year (needs the lubridate package)
  z <- ave(as.integer(s), lubridate::year(s), FUN = seq_along)
  z[match(y, s)]                        # counter of each sampled date
}
set.seed(123)
x1 <- f1("2008-01-01", "2010-12-31", 100)
set.seed(123)
x2 <- f2("2008-01-01", "2010-12-31", 100)
all.equal(x1, x2)
#[1] TRUE
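The ave/seq_along/match idea in f2 can be sketched on a tiny range that crosses a year boundary. This is an illustration only: format(s, "%Y") stands in for lubridate::year(s) so the snippet has no package dependency, and the names counter, doy and y are made up for the example. It also shows a caveat: the per-year counter equals the day of the year only when the first date of each year in the range is January 1, which holds for the "2008-01-01" start used above.

```r
s <- seq(as.Date("2009-12-30"), as.Date("2010-01-02"), by = "days")

# Restart a counter at 1 for each calendar year present in s
counter <- ave(as.integer(s), format(s, "%Y"), FUN = seq_along)
counter
# 1 2 1 2

# The true day of the year differs for 2009, because s starts on December 30
doy <- as.numeric(format(s, format = "%j"))
doy
# 364 365 1 2

# match() finds the position of each sampled date in s,
# which is then used to look up its counter
y <- s[c(3, 1)]
counter[match(y, s)]
# 1 1
```

If the range may start mid-year, f1 is the safer choice, since "%j" does not depend on where the sequence begins.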
Now the tests.
library(microbenchmark)
mb <- microbenchmark(
  f1 = f1("2008-01-01", "2010-12-31", 1e4),
  f2 = f2("2008-01-01", "2010-12-31", 1e4),
  times = 50
)
print(mb, order = "median")
ggplot2::autoplot(mb)
