I am struggling to learn how to program using the hadleyverse. I've read the NSE and lazyeval vignettes, but I'm still lost...

I'm trying to translate the example given on the tidyr::complete help page to an SE case.

library(dplyr)  # data_frame() and %>%
library(tidyr)  # complete() and nesting()

df <- data_frame(
  group = c(1:2, 1),
  item_id = c(1:2, 2),
  item_name = c("a", "b", "b"),
  value1 = 1:3,
  value2 = 4:6
)
df %>% complete(group, nesting(item_id, item_name))

My ultimate goal is to be able to do the same thing with my variables specified as:

v1 <- 'group'
v2 <- 'item_id, item_name'
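
Because v2 holds both column names in a single comma-separated string, it would probably have to be split into a character vector before any of the SE functions could use it. A minimal base-R sketch of that step (the name v2_vec is only for illustration and is not part of the original code):

```r
v2 <- 'item_id, item_name'

# Split on the comma, then strip the whitespace left around each piece.
v2_vec <- trimws(strsplit(v2, ",", fixed = TRUE)[[1]])
v2_vec
```

The result is a character vector with one element per column name.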

But before I try that, I need to be able to work it out using column names directly. To get started, even though I know it's not what I want, this at least doesn't throw an error:

df %>% complete_(list(~group, ~item_id, ~item_name))

What I can't figure out is how to include nesting_. Things I've tried:

df %>% complete_(~group, nesting_(~item_id, ~item_name)) 
# Error in nesting_(~item_id, ~item_name) : unused argument (~item_name)

df %>% complete_(~group, nesting_(list(~item_id, ~item_name)))
# Error: Each variable must be named. 
# Problem variables: 1, 2 

df %>% complete_(~group, nesting_(alist(~item_id, ~item_name)))
# Error: Each variable must be named. 
# Problem variables: 1, 2 

df %>% complete_(~group, nesting_(list('item_id' = item_id, 'item_name' = item_name)))
# Error in stopifnot(is.list(x)) : object 'item_id' not found

df %>% complete_(~group, nesting_(list('item_id' = df$item_id, 'item_name' = df$item_name)))
# No syntax error, but doesn't expand...

df %>% complete_(~group, nesting_(named_dots(item_id, item_name)))
# Error: Each variable must be a 1d atomic vector or list.
# Problem variables: 'item_id', 'item_name'

df %>% complete_(~group, nesting_(list(as.name(item_id), as.name(item_name))))
# Error in as.name(item_id) : object 'item_id' not found

df %>% complete_(~group, nesting_(as.name(item_id), as.name(item_name)))
# Error in nesting_(as.name(item_id), as.name(item_name)) : 
#   unused argument (as.name(item_name))

Thanks for any help!!

ap53

3 Answers

Thanks to @aosmith's suggestions, I hacked together this workaround.
It's probably not the best or most correct way, but it seems to work.

Starting with the last of his statements that does work:

v1 <- 'group'
v2 <- c("item_id", "item_name")
df %>% complete_(list(as.name(v1), ~nesting_(setNames(list(item_id, item_name), v2))))

I played around with the setNames call, to see what it did:

setNames(list(df$item_id, df$item_name), v2)
$item_id
[1] 1 2 2

$item_name
[1] "a" "b" "b"

and realized it was just building a named list of the df columns named in v2. So I tried to get the same thing via select_:

df %>% complete_(list(as.name(v1), ~nesting_(select_(., .dots = v2))))
# A tibble: 4 × 5
  group item_id item_name value1 value2
  <dbl>   <dbl>     <chr>  <int>  <int>
1     1       1         a      1      4
2     1       2         b      3      6
3     2       1         a     NA     NA
4     2       2         b      2      5
ap53

I got complete_ and nesting_ to work together like this:

df %>% complete_(list(~group, ~nesting_(list(item_id = item_id, item_name = item_name))))

Looking at the code for nesting_, it looks like the need for a named list comes from its use of tibble::as_data_frame.

However, the code above doesn't help much once you start passing the variable names as strings. complete_ itself still works fine:

df %>% complete_(list(as.name(v1), ~nesting_(list(item_id = item_id, item_name = item_name))))

And you can make the named list for nesting_ via setNames and a vector of the names:

v2 <- c("item_id", "item_name")
df %>% complete_(list(as.name(v1), ~nesting_(setNames(list(item_id, item_name), v2))))

But I didn't find a way to build the nesting_ argument from the vector of names alone. My failures involved things like

df %>% complete_(list(as.name(v1), ~nesting_(setNames(lapply(v2, as.name), v2))))

# Error: Each variable must be a 1d atomic vector or list.
# Problem variables: 'item_id', 'item_name'

I didn't try much beyond that, but it may give you a starting point.
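
One likely reason for that last error, sketched in base R only: lapply(v2, as.name) yields a list of symbols, whereas the error message asks for atomic vectors or lists. In the sketch below a plain data.frame stands in for the tibble, and the names syms, cols and cols2 are made up for illustration.

```r
# Base R only; a plain data.frame stands in for the tibble from the question.
df <- data.frame(
  group = c(1, 2, 1),
  item_id = c(1, 2, 2),
  item_name = c("a", "b", "b"),
  stringsAsFactors = FALSE
)
v2 <- c("item_id", "item_name")

# A named list of symbols: each element is a name, not a column of values.
syms <- setNames(lapply(v2, as.name), v2)
is.symbol(syms$item_id)
is.atomic(syms$item_id)

# A named list of the column values themselves, taken by subsetting.
cols <- as.list(df[v2])
is.atomic(cols$item_id)

# The same list, obtained by evaluating each symbol with df as the environment.
cols2 <- lapply(syms, eval, envir = df)
identical(cols, cols2)
```

Either cols or cols2 has the shape that the working setNames(list(item_id, item_name), v2) call produces, so under these assumptions the symbols would need to be evaluated against the data before they reach nesting_.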

aosmith
  • I'll still have to study ;-) But the question as I asked it is answered. Thank you! – ap53 Nov 07 '16 at 12:20

Another possibility is:

v1 <- 'group'
v2 <- c('item_id', 'item_name')
df %>% complete_(c(v1, ~do.call(nesting, lapply(v2, as.name))))

  group item_id item_name value1 value2
  <dbl>   <dbl>     <chr>  <int>  <int>
1     1       1         a      1      4
2     1       2         b      3      6
3     2       1         a     NA     NA
4     2       2         b      2      5

This doesn't use the "SE" nesting_(); rather, it takes advantage of the fact that the arguments to complete_ can be lazily evaluated. I'm not convinced this is preferable to ap53's answer above, but it does remove the explicit use of select_().
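
For readers on a newer tidyr, where as far as I know the underscore verbs such as complete_ are deprecated in favour of tidy evaluation, the same result can probably be had by splicing symbols into the ordinary complete() and nesting() calls. This is only a sketch, assuming tidyr 1.0 or later plus the rlang and tibble packages; the name out is made up for illustration.

```r
library(tidyr)   # assumed: a version with tidy evaluation support (1.0 or later)
library(rlang)   # for syms()

df <- tibble::tibble(
  group = c(1:2, 1),
  item_id = c(1:2, 2),
  item_name = c("a", "b", "b"),
  value1 = 1:3,
  value2 = 4:6
)

v1 <- "group"
v2 <- c("item_id", "item_name")

# syms() turns the strings into symbols; !!! splices them into the call,
# so this is meant to read as complete(df, group, nesting(item_id, item_name)).
out <- complete(df, !!!syms(v1), nesting(!!!syms(v2)))
out
```

Because both v1 and v2 go through syms(), either one can hold any number of column names.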

stephematician