I have a dataframe of return values of cryptocurrencies, each sorted into a group of 5 based on price (dataframe has 1,847,012 entries)
2014-01-02 XRP 0.064041998 1
2014-01-02 Dogecoin -0.220306670 1
2014-01-02 FLO 0.043457042 1
2014-01-02 CasinoCoin -0.080614279 2
2014-01-02 Nxt 0.048375563 2
2014-01-02 Quark 0.004495892 2
2014-01-02 Feathercoin 0.218078384 3
2014-01-02 Diamond 0.052490183 3
2014-01-02 Unobtanium -0.107420249 4
2014-01-02 Peercoin 0.263435789 4
2014-01-02 Primecoin 0.230874782 4
2014-01-02 Bitcoin 0.039387728 5
2014-01-02 Litecoin 0.045263780 5
2014-01-02 Namecoin 0.318292245 5
2014-01-02 Ethereum NA NA
... ... ... ...
2014-01-03 XRP -0.03900908 1
2014-01-03 Dogecoin -0.15273525 1
2014-01-03 FLO 0.02397348 1
2014-01-03 CasinoCoin 0.05748349 2
... ... ... ...
The date going up to 2019-12-31 (Some values are not available if the coin didnt exist at that time, like Ethereum in 2014).
What i want to do is create a table that takes the mean of each group, for each day, so
2014-01-02 2014-01-03 2014-01-04 ...
1 mean(group 1) mean(group 1)
2 mean(group 2) mean(group 2)
3 mean(group 3) mean(group 3)
4 mean(group 4) mean(group 4)
5 mean(group 5) mean(group 5)
I tried to look up and found Aggregate / summarize multiple variables per group (e.g. sum, mean) What i tried was:
means = as.data.frame(aggregate(d$value, list(d$Group), mean, na.rm = TRUE))
But this only does it for each group, over all dates, and gives me 5 values overall, where i need 5 values each day
Group.1 x
1 -4.920999e-03
2 -3.372798e-03
3 -1.548296e-03
4 -5.959693e-05
5 6.303165e-04
I also tried using data.table
df3 <- setDT(d)[, lapply(.SD, mean), by=.(date, Group), .SDcols=c("date","Group")]
But this is not able to read the dates