I'm trying to get the weighted row mean of certain columns by group.
library(tidyverse)
set.seed(1)
df <- data.frame(group = rep(LETTERS[3:4], each = 10),
x = runif(n = 10, min = 10, max = 15),
y = runif(n = 10, min = 100, max = 150),
z = runif(n = 10, min = 100, max = 150))
df
# group x y z
# 1 C 11.32754 110.2987 146.7353
# 2 C 11.86062 108.8278 110.6071
# 3 C 12.86427 134.3511 132.5837
# 4 C 14.54104 119.2052 106.2778
# 5 C 11.00841 138.4921 113.3610
# 6 C 14.49195 124.8850 119.3057
# 7 C 14.72338 135.8809 100.6695
# 8 C 13.30399 149.5953 119.1194
# 9 C 13.14557 119.0018 143.4845
# 10 C 10.30893 138.8723 117.0174
# 11 D 11.32754 110.2987 146.7353
# 12 D 11.86062 108.8278 110.6071
# 13 D 12.86427 134.3511 132.5837
# 14 D 14.54104 119.2052 106.2778
# 15 D 11.00841 138.4921 113.3610
# 16 D 14.49195 124.8850 119.3057
# 17 D 14.72338 135.8809 100.6695
# 18 D 13.30399 149.5953 119.1194
# 19 D 13.14557 119.0018 143.4845
# 20 D 10.30893 138.8723 117.0174
To get crude row mean of x
, y
, z
, I can do:
df %>%
mutate(rmean = pmap_dbl(list(x, y, z), ~mean(c(...))))
But I want to weight them by these weights
dfweight <- data.frame(group = c("C", "C", "C",
"D", "D", "D"),
cat = c("x", "y", "z",
"x", "y", "z"),
weights = c(.2, .7, .1,
.4, .1, .5))
# group cat weights
# 1 C x 0.2
# 2 C y 0.7
# 3 C z 0.1
# 4 D x 0.4
# 5 D y 0.1
# 6 D z 0.5
I thought I should extract the weights first:
dfweight_split <- lapply(split(dfweight, dfweight$group), function (x) x$weights)
dfweight_split
# $C
# [1] 0.2 0.7 0.1
# $D
# [1] 0.4 0.1 0.5
But I'm then unsure how to pmap/map
over these?
df %>%
group_by(group) %>%
mutate(wmean = pmap_dbl(list(x, y, z), ~weight.mean(c(..., dfweight_split))))
#OR
df %>%
group_by(group) %>%
mutate(wmean = map2(list(x, y, z), dfweight_split, ~weight.mean(.x, .y)))
Happy to see base
solutions too. A similar post is here.
thanks