I am trying to write a function that computes a weighted mean of the same variable across several data frames stored in a list. The function does not pick up some of its arguments (wage and weight). I suspect a quoting ("") or [[ ]] problem, but I can't make it work.

Here's a reproducible example that gives me the error:

library(dplyr)

set.seed(555)
lista <- list(A = data.frame(wage = (runif(10, min=50, max=100)), weight = (runif(10, min=0, max=1))),
              B = data.frame(wage = (runif(10, min=55, max=105)), weight = (runif(10, min=0.1, max=1))))
lista


wmeanf <- function(df, x, w) {
  mean <- df %>% summarise (weighted.mean(x,w))
  mean
}


twmean <- sapply(lista, function (X) wmeanf (df = X, x = wage, w = weight))

Thanks!

Juan C

2 Answers

There are several ways to accomplish this. Hopefully one of these gets you going in the right direction:

library(tidyverse)

set.seed(555)
lista <- list(A = data.frame(wage = (runif(10, min=50, max=100)), weight = (runif(10, min=0, max=1))),
              B = data.frame(wage = (runif(10, min=55, max=105)), weight = (runif(10, min=0.1, max=1))))

map(lista, ~ weighted.mean(x = .$wage, w = .$weight))
#> $A
#> [1] 75.60411
#> 
#> $B
#> [1] 70.22652
lapply(lista, function(x) { weighted.mean(x = x$wage, w = x$weight) })
#> $A
#> [1] 75.60411
#> 
#> $B
#> [1] 70.22652
sapply(lista, function(x) { weighted.mean(x = x$wage, w = x$weight) })
#>        A        B 
#> 75.60411 70.22652

Created on 2020-05-05 by the reprex package (v0.3.0)
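If a reusable wrapper is still wanted, one option that sidesteps quoting entirely is to pass the column names as strings and index with `[[ ]]`. This is only a minimal sketch: `wmean_chr` and `res_chr` are hypothetical names introduced here for illustration, not part of the original question.

```r
# Hypothetical helper: x and w are column names given as character strings,
# so plain base R indexing works and no dplyr quoting is involved.
wmean_chr <- function(df, x, w) {
  weighted.mean(df[[x]], df[[w]])
}

set.seed(555)
lista <- list(A = data.frame(wage = runif(10, min = 50, max = 100), weight = runif(10, min = 0, max = 1)),
              B = data.frame(wage = runif(10, min = 55, max = 105), weight = runif(10, min = 0.1, max = 1)))

# Returns a named numeric vector with one weighted mean per data frame.
res_chr <- sapply(lista, function(d) wmean_chr(d, "wage", "weight"))
res_chr
```

The trade-off is that the caller writes `"wage"` rather than `wage`, which is less dplyr-like but keeps the function free of non-standard evaluation.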

JasonAizkalns
  • Thanks Jason. In this case my function can be dropped entirely and weighted.mean run directly over the list :) But if I wanted to use it, or a more complex one, what would be the right grammar to write and call it? – Juan C May 05 '20 at 12:47
  • @JuanC The grammar should be more or less the same, provided the function is written correctly. Programming with `dplyr` is a bit more involved because it utilizes non-standard evaluation; [see this vignette](https://dplyr.tidyverse.org/articles/programming.html). Regardless, if you want to do something more involved with a custom function and get stuck, just use the search here and/or post another reproducible example. – JasonAizkalns May 05 '20 at 13:10
  • Thanks, that is a helpful vignette, will check that! – Juan C May 05 '20 at 14:00

Following @Jason's suggestion to read the vignette on dplyr evaluation and quoting, I found a way to make my originally intended function work:

library(dplyr)

set.seed(555)
lista <- list(A = data.frame(wage = (runif(10, min=50, max=100)), weight = (runif(10, min=0, max=1))),
              B = data.frame(wage = (runif(10, min=55, max=105)), weight = (runif(10, min=0.1, max=1))))

wmeanf <- function(df, x, w) {

  # Capture the bare column names passed by the caller as quosures
  x <- enquo(x)
  w <- enquo(w)

  # Unquote with !! so summarise() evaluates them as columns of df
  mean <- df %>% summarise (weighted.mean(!!x,!!w))
  mean
}

sapply(lista, function (X) wmeanf (df = X, x = wage, w = weight))

$`A.weighted.mean(wage, weight)`
[1] 75.6041053069

$`B.weighted.mean(wage, weight)`
[1] 70.2265239366
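The same idea can be written more compactly with the embrace operator `{{ }}`, which combines `enquo()` and `!!` in one step. The sketch below assumes a dplyr/rlang version that supports `{{ }}` (rlang 0.4.0 or later); `wmeanf2`, the column name `wmean`, and the use of `pull()` to return a bare number are choices made here for illustration.

```r
library(dplyr)

set.seed(555)
lista <- list(A = data.frame(wage = runif(10, min = 50, max = 100), weight = runif(10, min = 0, max = 1)),
              B = data.frame(wage = runif(10, min = 55, max = 105), weight = runif(10, min = 0.1, max = 1)))

# Hypothetical variant of wmeanf: {{ x }} forwards the caller's bare column
# name into summarise(), and pull() extracts the single value as a number.
wmeanf2 <- function(df, x, w) {
  df %>%
    summarise(wmean = weighted.mean({{ x }}, {{ w }})) %>%
    pull(wmean)
}

# Each call returns a scalar, so sapply() simplifies to a named numeric vector.
twmean <- sapply(lista, function(d) wmeanf2(d, x = wage, w = weight))
twmean
```

Naming the summary column and pulling it out avoids the long `A.weighted.mean(wage, weight)` element names shown in the output above.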
Juan C