I have a function that counts the zeros in each column of a large data frame. Now I want to count the zeros in each column after grouping the rows by a category. Here is an example:
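zero_rate <- function(df) {
  # For each column: number of zeros (NA is not counted as zero), length, and zero rate
  z_rate_list <- sapply(df, function(x) {
    data.frame(
      n_zero = length(which(x == 0)),
      n = length(x),
      z_rate = length(which(x == 0)) / length(x))
  })
  # Flatten the per-column results into a plain data frame
  d <- data.frame(z_rate_list)
  d <- sapply(d, unlist)
  d <- as.data.frame(d)
  return(d)
}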
library(dplyr)  # provides %>%, group_by() and do()
df <- data.frame(var1 = c(1, 0, NA, 4, NA, 6, 7, 0, 0, 10), var2 = c(11, NA, NA, 0, NA, 16, 0, NA, 19, NA))
df1 <- data.frame(cat = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), df)
zero_rate_df <- df1 %>% group_by(cat) %>% do(zero_rate(.))
Calling zero_rate(df) works just as I expected. But when I group the data by cat and compute zero_rate for each column within each category, the result is not what I want. I expect something like this:
cat          var1  var2
1   n_zero   1     1
    n        5     5
    z_rate   0.2   0.2
2   n_zero   2     1
    n        5     5
    z_rate   0.4   0.2
Any suggestions? Thank you.
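One possible direction, as a minimal base-R sketch rather than a definitive fix: compute the statistics on each category's rows with the grouping column dropped, and keep the statistic name in an ordinary column instead of in the row names, so it is still there after the pieces are stacked. The names zero_stats, stat, value_cols and zero_rate_by_cat are hypothetical and introduced only for this illustration. The sketch assumes cat contains no NA values.

```r
# Hypothetical helper: one row per statistic, one column per variable.
# NA values are not counted as zeros but do count toward n,
# which matches length(which(x == 0)) / length(x) in zero_rate().
zero_stats <- function(df) {
  stats <- sapply(df, function(x) {
    n_zero <- sum(x == 0, na.rm = TRUE)
    c(n_zero = n_zero, n = length(x), z_rate = n_zero / length(x))
  })
  # stats is a 3 x ncol(df) matrix; move its row names into a "stat" column
  data.frame(stat = rownames(stats), stats, row.names = NULL)
}

df <- data.frame(var1 = c(1, 0, NA, 4, NA, 6, 7, 0, 0, 10),
                 var2 = c(11, NA, NA, 0, NA, 16, 0, NA, 19, NA))
df1 <- data.frame(cat = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), df)

# Every column except the grouping column gets summarised
value_cols <- setdiff(names(df1), "cat")

# Summarise each category separately and label the rows with its cat value
pieces <- lapply(unique(df1$cat), function(k) {
  rows <- df1$cat == k
  cbind(cat = k, zero_stats(df1[rows, value_cols, drop = FALSE]))
})

# Stack the per-category pieces into one long data frame
zero_rate_by_cat <- do.call(rbind, pieces)
rownames(zero_rate_by_cat) <- NULL
print(zero_rate_by_cat)
```

The result has six rows (three statistics for each of the two categories) with columns cat, stat, var1 and var2. The same helper could probably be plugged into a grouped dplyr pipeline as well, as long as the grouping column is excluded before the statistics are computed.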