I would like to combine cut() with group_by(), but it is not working. I tried to follow the recommendations in the thread "Using cut() with group_by()", but it still did not work.
Here is a reproducible example:
library(dplyr)
set.seed(1)
df <- tibble(
  V1 = round(runif(1000, min = 1, max = 1000)),
  V2 = round(runif(1000, min = 1, max = 3)),
  V3 = round(runif(1000, min = 1, max = 10))
)
df$V2 <- as.factor(df$V2)
df$V3 <- as.factor(df$V3)
# Overall split of V1 at the 20% and 60% quantiles (three categories)
df$split <- cut(df$V1, quantile(df$V1, c(0, .2, .6, 1)), include.lowest = TRUE)
Here is how I successfully combined group_by() with the ntile() function:
df <- df %>%
  group_by(V2, V3) %>%
  mutate(quartile_by_group = ntile(V1, 4))
But this does not work when I use cut() instead of ntile(). The result clearly has dozens of levels instead of only three categories:
df <- df %>%
  group_by(V2, V3) %>%
  mutate(split_by_group = cut(V1, quantile(V1, c(0, .2, .6, 1)), include.lowest = TRUE))
table(df$split_by_group)
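My guess at the cause, as a minimal sketch rather than a confirmed diagnosis: cut() names each level after its interval, and the quantile breaks differ from group to group, so every group may contribute its own set of interval labels. Passing fixed labels to cut() seems to keep the result at three categories. The toy data and the helper name cut3 below are hypothetical, made up only for illustration.

```r
library(dplyr)

# Hypothetical toy data: two groups with clearly different value ranges
toy <- tibble(
  g = rep(c("a", "b"), each = 10),
  x = c(1:10, 101:110)
)

# cut3 is an assumed helper name, not part of dplyr or base R.
# It cuts at the 20% and 60% quantiles of whatever vector it receives
# and uses fixed labels instead of interval-based ones.
cut3 <- function(x) {
  cut(x,
      breaks = quantile(x, c(0, .2, .6, 1)),
      labels = c("low", "mid", "high"),
      include.lowest = TRUE)
}

toy <- toy %>%
  group_by(g) %>%
  mutate(split_by_group = cut3(x)) %>%  # breaks are computed within each group
  ungroup()

table(toy$split_by_group)
```

Because the labels are the same in every group, the combined column should have only the three levels low, mid and high, even though the underlying break points are group-specific. This assumes the quantile breaks are unique within each group; cut() raises an error otherwise.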