I want to take a dataset and split it into multiple datasets. Realistically, I will have thousands of rows, but here is a simplified version of the problem for the purpose of understanding. Suppose you have the following code:
vec = c(1:10)
df = data.frame(vec)
df
   vec
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
10  10
I would like to split this dataset into groups of 5 rows each and then get the mean of each group.
So far I've tried to split the data frame in the following manner:
splitdf = split(df, rep(1:2,each = 5))
Now I would like to get the mean of each group. For example, the mean of the first chunk is 3 and the second chunk is 8.
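A minimal sketch of that step, assuming the same df and split as above: sapply() can apply mean() to the vec column of each chunk in the list returned by split(). The name group_means is only illustrative.

```r
# Rebuild the example data so the snippet runs on its own
vec <- 1:10
df <- data.frame(vec)

# Split into chunks of 5 rows, as in the attempt above
splitdf <- split(df, rep(1:2, each = 5))

# Mean of the vec column within each chunk; returns a named numeric vector
group_means <- sapply(splitdf, function(chunk) mean(chunk$vec))
group_means
```

For the example data this should give 3 for the first chunk and 8 for the second, matching the values worked out by hand.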
Then, I would like to repeat each group's mean for every row in that group (for example with rep) and store the result in a separate column. I want my data frame to look like the following:
   vec mean
1    1    3
2    2    3
3    3    3
4    4    3
5    5    3
6    6    8
7    7    8
8    8    8
9    9    8
10  10    8
I was wondering whether a loop would be appropriate or whether there's a simpler way to go about this problem. I am open to suggestions.
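One loop-free possibility, sketched here under the assumption that the groups are simply consecutive blocks of rows: base R's ave() computes a per-group summary (mean by default) and returns it already repeated for every row, so no explicit split() or rep() is needed. The names chunk_size and group_id are illustrative, not part of the original code.

```r
# Rebuild the example data so the snippet runs on its own
vec <- 1:10
df <- data.frame(vec)

# Assumed chunk length; rows 1-5 get id 1, rows 6-10 get id 2, and so on
chunk_size <- 5
group_id <- ceiling(seq_len(nrow(df)) / chunk_size)

# ave() applies mean() within each group and aligns the result with the rows
df$mean <- ave(df$vec, group_id)
df
```

Because group_id is derived from the row count, the same lines should carry over to thousands of rows, and a final chunk shorter than chunk_size would just be averaged over the rows it has.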