I have a large data set (49000 x 118). I want to group by one column and then get a summary of several other columns. The problem is that the summary of each column has a different length.
Here is a simple example of my data:
dat <- data.frame(
  test_number = as.factor(c("test1", "test1", "test1", "test1", "test1", "test1", "test2", "test2", "test2", "test3", "test3", "test3", "test3", "test3", "test3")),
  question1_response = as.factor(c("yes", NA, "no", "not answered", "yes", "yes", NA, "no", "yes", "yes", "yes", "yes", "yes", "yes", "yes")),
  question2_response = as.factor(c("yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "no")),
  question3_response = as.factor(c("yes", NA, "no", "yes", NA, "no", "yes", NA, "no", "yes", NA, "no", "yes", NA, "no"))
)
I would like to group by test_number and get a summary of the responses in each of columns 2:4.
Some of the code I have tried:
library(dplyr)

# Attempt 1
summary1 <- dat %>%
  group_by(test_number) %>%
  group_map(~ summarize(.x, across(everything(), summary)))

# Attempt 2
lapply(dat[-1],
       FUN = function(x) { group_by(test_number) %>% summary(factor(x)) })

# Attempt 3
dat %>%
  group_by(test_number) %>%
  lapply(dat[c(2:4)], FUN = function(x) summary(x))
I expect the result to look something like this (I did it in Excel).
I padded the columns of unequal length with NAs, but I am not particular about the structure as long as I get the information.
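Since the structure is flexible, one shape that would avoid the padding is a long table with one row per test, question, and response. Below is a minimal sketch of that idea, not a definitive solution. It assumes the dplyr and tidyr packages are available, uses a small stand-in for the real data, and the column names `question`, `response`, and `n` are only illustrative.

```r
library(dplyr)
library(tidyr)

# Small stand-in with the same shape as the example data (assumption).
dat <- data.frame(
  test_number = factor(c("test1", "test1", "test2")),
  question1_response = factor(c("yes", NA, "no")),
  question2_response = factor(c("yes", "yes", "yes"))
)

counts <- dat %>%
  # Convert the response factors to character so they stack cleanly.
  mutate(across(-test_number, as.character)) %>%
  # One row per (test, question, response) observation.
  pivot_longer(-test_number, names_to = "question", values_to = "response") %>%
  # Count each response within each test and question; NA is kept as its own row.
  count(test_number, question, response)

print(counts)
```

Because every count sits in its own row, questions with different numbers of distinct responses no longer need to be the same length.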
Thank you