I want to calculate the trend for each calendar day across several years, for example the trend for May 1st from 2000 to 2010. Here is my test data frame:
library(lubridate)
date_list = seq(ymd('2000-01-15'),ymd('2010-09-18'),by='day')
testframe = data.frame(Date = date_list)
testframe$Day = substr(testframe$Date, start = 6, stop = 10)
testframe$V1 = rnorm(3900)
testframe$V2 = rnorm(3900)
testframe$V3 = seq(from = 10, to = 25, length.out = 3900)
testframe$V4 = seq(from = 5, to = 45, length.out = 3900)
V1 to V4 are the values. In testframe$Day I have already extracted the month and day (without the year), so that I can use it to group the rows. I know that aggregate is good for this kind of grouping, but I am pretty clueless about how to combine it with a linear model.
In the end I would like to have a data frame with one column containing each day (without the year, of course) and further columns containing the trend/slope of each of the values V1 to V4.
Any ideas?
UPDATE:
To make it clearer: I want an output that looks like this (the trend values here are made up):
Day    V1 Trend  V2 Trend  V3 Trend  V4 Trend
01-01  +0.3      +0.4      +0.9      +0.5
01-02  +0.5      +0.3      +0.8      +0.4
01-03  -0.1      -0.2      +1.0      -0.3
01-04  +0.7      -0.7      +0.9      +0.9
......
......
12-30  -0.3      -0.4      +0.5      +0.8
12-31  -0.7      -0.3      +0.6      +0.9
p-values, the intercept and so on would also be great to have.
I found this example, but it's still not in the output format that I want:
# Add the year as a numeric predictor for lm
testframe$Year = as.numeric(format(testframe$Date,'%Y'))
library(plyr)
# Split testframe by Day, then fit the specified model to each piece and
# return a list of models
models <- dlply(testframe, "Day", function(df)
  lm(V4 ~ Year, data = df))
# Apply coef to each model and return a data frame (intercept and slope per Day)
ldply(models, coef)
# Print the summary of each model
l_ply(models, summary, .print = TRUE)
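For reference, the shape of result I am after can be sketched in base R roughly as follows. This is only a sketch under assumptions, not a definitive solution: fit_one and value_cols are names made up for this illustration, the seed is arbitrary, the data is rebuilt without lubridate so the snippet runs on its own, and each fit regresses the value on Year (one point per year for a given day).

```r
# Rebuild the test data with base R only (assumption: fixed seed for reproducibility)
set.seed(1)
date_list <- seq(as.Date("2000-01-15"), as.Date("2010-09-18"), by = "day")
testframe <- data.frame(Date = date_list)
n <- nrow(testframe)
testframe$Day  <- format(testframe$Date, "%m-%d")            # month-day key for grouping
testframe$Year <- as.numeric(format(testframe$Date, "%Y"))   # numeric predictor
testframe$V1 <- rnorm(n)
testframe$V2 <- rnorm(n)
testframe$V3 <- seq(from = 10, to = 25, length.out = n)
testframe$V4 <- seq(from = 5, to = 45, length.out = n)

value_cols <- c("V1", "V2", "V3", "V4")  # hypothetical name: columns to fit

# Hypothetical helper: fit <col> ~ Year on one day's rows and return
# the intercept, the slope (trend per year) and the slope's p-value
fit_one <- function(df, col) {
  if (nrow(df) < 3) {
    return(c(Intercept = NA_real_, Trend = NA_real_, p = NA_real_))
  }
  fit <- lm(reformulate("Year", response = col), data = df)
  # Feb 29 has only three points and V3/V4 are exactly linear there, so
  # summary() may warn about an essentially perfect fit; silenced on purpose
  co <- suppressWarnings(summary(fit)$coefficients)
  c(Intercept = unname(co["(Intercept)", "Estimate"]),
    Trend     = unname(co["Year", "Estimate"]),
    p         = unname(co["Year", "Pr(>|t|)"]))
}

# Split by Day, fit every value column, and bind into one row per Day
by_day <- split(testframe, testframe$Day)
rows <- lapply(by_day, function(df) {
  unlist(lapply(value_cols, function(col) {
    s <- fit_one(df, col)
    names(s) <- paste(col, names(s), sep = "_")
    s
  }))
})
trends <- data.frame(Day = names(rows), do.call(rbind, rows), row.names = NULL)
head(trends)
```

The result has one row per Day and, for each of V1 to V4, three columns named like V1_Intercept, V1_Trend and V1_p. Note that each per-day fit rests on at most eleven points (three for 02-29), so the p-values should be read with caution.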