Is there a way to summarise my questionnaire data to show the number of each response for each question?

Question

My data looks a bit like this:

Q1  Q2  Q3  Q4
S   NS  S   S
NS  .   S   PMI
S   S   TMI S
PMI S   S   NS

The only options are S, TMI, PMI, NS (or missing, which is a .). I would like it to look like this (df called PCP):

    Q1  Q2  Q2  Q4
S   2   2   3   2
NS  1   1   0   1
PMI 1   0   0   1
TMI 0   0   1   0
.   0   1   0   0

I can do it in 2 steps with this code:

Counts <-  melt(table(PCP$Q1)) %>%
  join(y = melt(table(PCP$Q2)),  type = "full") %>%
  join(y = melt(table(PCP$Q3)),  type = "full") %>%
  join(y = melt(table(PCP$Q4)),  type = "full")
Counts <- melt(Counts, key='Var1')

Bur is there a nice way of doing this in one step maybe using dplyr/plyr packages?

Thanks for the response! Unfortunately stack just gives the error: "Error in stack.data.frame(PCP) : no vector columns were selected" — morgan121, Oct 18 '18 at 05:30
That's because you have `factor` columns instead of `character` columns in your data.frame (and rightfully so - stacking/gathering/melting data from different categorical variables with potentially different labels doesn't make sense). It wasn't clear from your question but I assure you it works if you read your data in with `stringsAsFactors=FALSE` — thelatemail, Oct 18 '18 at 05:31
right you are @thelatemail!! What a legend, works very well now :) — morgan121, Oct 18 '18 at 05:38

score 0 · Accepted Answer · answered Oct 18 '18 at 05:29

How about this:

library(tidyverse)

PCP %>% 
  gather(question, answer) %>%
  mutate(answer = replace(answer, answer==".", "Missing")) %>% 
  count(question, answer) %>%
  spread(question, n, fill=0)

  answer     Q1    Q2    Q3    Q4
1 Missing     0     1     0     0
2 NS          1     1     0     1
3 PMI         1     0     0     1
4 S           2     2     3     2
5 TMI         0     0     1     0

score 0 · Answer 2 · answered Oct 18 '18 at 05:30

0

Using tidyverse and reshape2:

df %>%
  gather(var, val) %>%
  group_by(var) %>%
  add_count(val) %>%
  dcast(val ~ var, value.var = "n")

  val Q1 Q2 Q3 Q4
1   .  0  1  0  0
2  NS  1  1  0  1
3 PMI  1  0  0  1
4   S  2  2  3  2
5 TMI  0  0  1  0

answered Oct 18 '18 at 05:30

tmfmnk

38,881
4
47
67

Is there a way to summarise my questionnaire data to show the number of each response for each question?

2 Answers2