Here is the data I am working with:
library(RCurl)  # provides getURL()
x <- getURL("https://raw.githubusercontent.com/dothemathonthatone/maps/master/testmain.csv")
data <- read.csv(text = x)
I want to make a dummy variable for the top, middle, and lower third of the values in year_hh_inc. Every value in my id column reg_schl potentially has more than one value for year_hh_inc, so the dummy variable needs to group on reg_schl.
I want to be able to differentiate the values in year_hh_inc within each unique reg_schl.
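To make the goal concrete, here is a minimal base R sketch on made-up numbers (the toy data frame and the helper name tercile_code are assumptions for illustration, not part of testmain.csv). It shows that the thirds are computed separately inside each reg_schl, so a value of 100 can be "low" in one group even though it exceeds every value in another group.

```r
# Toy data: hypothetical values, two groups of three incomes each
toy <- data.frame(
  reg_schl    = c(1, 1, 1, 2, 2, 2),
  year_hh_inc = c(10, 20, 30, 100, 200, 300)
)

# Hypothetical helper: 1 = lower third, 2 = middle third, 3 = top third,
# computed from the quantiles of the vector it is given
tercile_code <- function(v) {
  brks <- quantile(v, c(0, 1 / 3, 2 / 3, 1), na.rm = TRUE)
  as.integer(cut(v, breaks = brks, include.lowest = TRUE))
}

# ave() applies the helper within each reg_schl and keeps the row order
toy$tercile <- ave(toy$year_hh_inc, toy$reg_schl, FUN = tercile_code)

# One 0/1 dummy column per third
toy$low    <- as.integer(toy$tercile == 1)
toy$middle <- as.integer(toy$tercile == 2)
toy$high   <- as.integer(toy$tercile == 3)

print(toy)
```

Note that cut() stops with an error when the quantile breaks are not unique, which can happen in groups with very few or heavily tied values, so real data may need extra handling for such groups.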
So far I have the following, which was posted below as a solution by Sotos:
library(dplyr)  # group_by(), mutate(), %>%
library(tidyr)  # pivot_wider()
data %>%
  group_by(reg_schl) %>%
  mutate(category = cut(year_hh_inc, breaks = quantile(year_hh_inc, c(0, 1 / 3, 2 / 3, 1), na.rm = TRUE),
                        labels = c("low", "middle", "high"), include.lowest = TRUE),
         vals = 1) %>%
  pivot_wider(names_from = category, values_from = vals, values_fill = list(vals = 0))
This is working well.
I have also used this solution provided by Allan:
cut_by_id <- function(x)
{
  x$category <- cut(x$year_hh_inc, quantile(x$year_hh_inc, c(0, 1/3, 2/3, 1), na.rm = TRUE),
                    labels = c("low", "middle", "high"), include.lowest = TRUE)
  return(x)
}
# split on the id column, which is reg_schl in this data set
data <- do.call(rbind, lapply(split(data, data$reg_schl), cut_by_id))
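The base R version above adds a single category factor rather than separate 0/1 columns. One possible way to expand such a factor into dummies is sketched below on a toy factor; the values and the object names category and dummies are assumptions for illustration only.

```r
# Toy factor standing in for the category column (hypothetical values)
category <- factor(c("low", "middle", "high", "low"),
                   levels = c("low", "middle", "high"))

# Build one 0/1 column per level; a row whose category is NA stays NA
dummies <- sapply(levels(category), function(l) as.integer(category == l))

# Attach the dummy columns next to the original factor
result <- cbind(data.frame(category = category), dummies)
print(result)
```

With the real data, the same sapply() call could be applied to data$category and the resulting matrix bound to data with cbind().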