There are two questions about dplyr, which in my case are related by the problem I am trying to solve:
- How can I cross-classify a `data_frame` using pipes, when trying to pass the result of a series of operations to `xtabs`?
- The argument of a pipe is usually denoted by `.` in `dplyr` & `magrittr`, but this is also the token used to denote everything else in the formula interface. I know that there is an open issue on `dplyr` somewhere (can't locate it right now) which talks about replacing `.` with `_`.
Here is an example:
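    library(wakefield)  # provides r_data_frame(), r_sample_factor(), r_sample_logical()
    library(dplyr)      # provides filter() and re-exports the %>% pipe
    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      xtabs(. ~ cat1 + cat2 + cat3, data = .)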
which fails with the output:
    Error in model.frame.default(formula = . ~ cat1 + cat2 + cat3, data = .) :
      invalid type (list) for variable '.'
because `magrittr` is replacing the first `.` with the resultant `data_frame` of the previous computations. One way is to omit the first period altogether, like so:
    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      xtabs(~ cat1 + cat2 + cat3, data = .)
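
For anyone without `wakefield` installed, the same workaround can be sketched with a plain `data.frame`. The data frame `df`, the seed, and the use of `subset()` in place of `dplyr::filter()` are my own stand-ins, not part of the original example:

```r
library(magrittr)

# Hypothetical stand-in for the wakefield data: three factors and one logical.
set.seed(1)
lv <- LETTERS[1:3]
df <- data.frame(
  cat1 = factor(sample(lv, 100, replace = TRUE), levels = lv),
  cat2 = factor(sample(lv, 100, replace = TRUE), levels = lv),
  cat3 = factor(sample(lv, 100, replace = TRUE), levels = lv),
  bin1 = sample(c(TRUE, FALSE), 100, replace = TRUE)
)

# One-sided formula: the only `.` left is the pipe placeholder in `data = .`.
tab <- df %>%
  subset(bin1) %>%
  xtabs(~ cat1 + cat2 + cat3, data = .)

print(dim(tab))
```

The table has one dimension per factor, and its cell counts add up to the number of rows where `bin1` is `TRUE`.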
But what if the `.` needed to go on the other side of the formula?
Edit:
As pointed out by @MrFlick, `xtabs` does not take a RHS `.` anyway. I thought that this problem could just as well be exemplified using the RHS `.` conflict which I expected using the code:
    wakefield::r_data_frame(
      n = 100,
      cat1 = r_sample_factor(x = LETTERS[1:3]),
      cat2 = r_sample_factor(x = LETTERS[1:3]),
      cat3 = r_sample_factor(x = LETTERS[1:3]),
      bin1 = r_sample_logical()
    ) %>%
      dplyr::filter(bin1) %>%
      dplyr::select(-bin1) %>%
      xtabs(~ ., data = .)
but this does work exactly as expected. Can someone explain why `magrittr` is not trying to replace the first `.` with the `data_frame`?
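
A pipe-free check might help narrow this down. The sketch below uses only base R (`terms()` and `xtabs()` from `stats`) on a hypothetical data frame `df` that I made up in place of the `wakefield` data. It suggests that a RHS `.` is expanded to all columns of `data` by the formula machinery itself, so it may never be looked up as a variable at all. Whether that fully explains the behaviour inside the pipe is my assumption:

```r
# Hypothetical stand-in data: three factor columns only.
set.seed(1)
lv <- LETTERS[1:3]
df <- data.frame(
  cat1 = factor(sample(lv, 100, replace = TRUE), levels = lv),
  cat2 = factor(sample(lv, 100, replace = TRUE), levels = lv),
  cat3 = factor(sample(lv, 100, replace = TRUE), levels = lv)
)

# With `data` supplied, terms() expands a right-hand-side `.` to the columns.
labels_dot <- attr(terms(~ ., data = df), "term.labels")
print(labels_dot)

# No pipe involved: compare the `.` shorthand with the spelled-out formula.
tab_dot      <- xtabs(~ ., data = df)
tab_explicit <- xtabs(~ cat1 + cat2 + cat3, data = df)
print(all(tab_dot == tab_explicit))
```

If the two tables agree without any pipe, the RHS `.` is being handled by the formula interface rather than by `magrittr`.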