Now that by_row() in purrr is going to be (is?) deprecated, what is the new preferred tidyverse implementation of:

library(dplyr)  # rowwise(), do(), %>%
library(broom)  # tidy()
somedata = expand.grid(a=1:3,b=3,c=runif(3))
somedata %>%
  rowwise() %>% do(binom.test(x=.$a,n=.$b,p=.$c) %>% tidy())

It seems as if you might nest each row into a single column, and then use map(), but I'm not sure how to do that nesting operation...plus it seems like that's a little obscure. Is there a better way?

Eli Berkow
Nicholas Root
  • I have a 100,000-row tibble, and rowwise() is incredibly slow. Any ideas for a more efficient operation? – jzadra Apr 20 '18 at 17:00

1 Answer

Here is one way, using base R's Map() together with map_df():

library(tidyverse)
library(broom)
do.call(Map, c(f = binom.test, unname(somedata))) %>%
      map_df(tidy)
#  estimate statistic    p.value parameter    conf.low conf.high              method alternative
#1 0.3333333         1 1.00000000         3 0.008403759 0.9057007 Exact binomial test   two.sided
#2 0.6666667         2 0.25392200         3 0.094299324 0.9915962 Exact binomial test   two.sided
#3 1.0000000         3 0.03571472         3 0.292401774 1.0000000 Exact binomial test   two.sided
#4 0.3333333         1 0.14190440         3 0.008403759 0.9057007 Exact binomial test   two.sided
#5 0.6666667         2 0.55583967         3 0.094299324 0.9915962 Exact binomial test   two.sided
#6 1.0000000         3 1.00000000         3 0.292401774 1.0000000 Exact binomial test   two.sided
#7 0.3333333         1 0.58810045         3 0.008403759 0.9057007 Exact binomial test   two.sided
#8 0.6666667         2 1.00000000         3 0.094299324 0.9915962 Exact binomial test   two.sided
#9 1.0000000         3 0.25948735         3 0.292401774 1.0000000 Exact binomial test   two.sided

Or with only tidyverse functions:

somedata %>%
     unname %>%
     pmap(binom.test) %>% 
     map_df(tidy)
#  estimate statistic    p.value parameter    conf.low conf.high              method alternative
#1 0.3333333         1 1.00000000         3 0.008403759 0.9057007 Exact binomial test   two.sided
#2 0.6666667         2 0.25392200         3 0.094299324 0.9915962 Exact binomial test   two.sided
#3 1.0000000         3 0.03571472         3 0.292401774 1.0000000 Exact binomial test   two.sided
#4 0.3333333         1 0.14190440         3 0.008403759 0.9057007 Exact binomial test   two.sided
#5 0.6666667         2 0.55583967         3 0.094299324 0.9915962 Exact binomial test   two.sided
#6 1.0000000         3 1.00000000         3 0.292401774 1.0000000 Exact binomial test   two.sided
#7 0.3333333         1 0.58810045         3 0.008403759 0.9057007 Exact binomial test   two.sided
#8 0.6666667         2 1.00000000         3 0.094299324 0.9915962 Exact binomial test   two.sided
#9 1.0000000         3 0.25948735         3 0.292401774 1.0000000 Exact binomial test   two.sided
akrun
  • Does the function call in pmap allow you to pass arguments? For example, if instead you wanted the "p" argument in binom.test to be "c-0.5", I'd want to do something like pmap(binom.test(p=.z-0.5)) but that obviously doesn't work. Is there an equivalent? – Nicholas Root Jun 28 '17 at 19:10
  • @NicholasRoot I guess you need `pmap(~ binom.test(..1, ..2, p = ..3 - 0.5))` – akrun Jun 29 '17 at 03:28
  • Note that you can avoid the `unname` if you use column names in `somedata` that match the arguments of the function (`binom.test` in this case). That would be more explicit and thus probably safer. – cboettig Jul 21 '17 at 04:09
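
The two comment suggestions can be sketched as follows. This is a minimal illustration, not code from the answer. The set.seed() call is added only for reproducibility. The names `by_name` and `adjusted` are hypothetical. The adjustment `c / 2` is an assumption standing in for `c - 0.5`, which can fall outside [0, 1] and would then be rejected by binom.test().

```r
library(purrr)
library(dplyr)
library(broom)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
somedata <- expand.grid(a = 1:3, b = 3, c = runif(3))

# 1. Rename the columns to binom.test()'s argument names (x, n, p).
#    pmap() then matches arguments by name, so unname() is not needed.
by_name <- somedata %>%
  rename(x = a, n = b, p = c) %>%
  pmap(binom.test) %>%
  map(tidy) %>%
  bind_rows()

# 2. Transform an argument inside an anonymous function.
#    pmap() passes each row's columns as the named arguments a, b, c.
#    p = c / 2 is a hypothetical adjustment chosen to stay inside [0, 1].
adjusted <- somedata %>%
  pmap(function(a, b, c) binom.test(x = a, n = b, p = c / 2)) %>%
  map(tidy) %>%
  bind_rows()
```

Both results have one row per row of `somedata`. Matching by name makes it explicit which column feeds which argument, whereas the unname() version depends on column order.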