I have a data frame that consists of 4 columns that represent questions, and each column has 4 levels that represent responses. For example (only two columns shown):
Q1 Q2
1 A A
2 A B
3 B B
4 C C
5 D D
And I'd like to derive a data.frame such as this:
question response percent
1 Q2 A 0.2
2 Q2 B 0.4
3 Q2 C 0.2
4 Q2 D 0.2
5 Q1 A 0.4
6 Q1 B 0.2
7 Q1 C 0.2
8 Q1 D 0.2
So far, I've been achieving this with a for loop, but my scripts are riddled with for loops, so I'd like to achieve this using functions in reshape2 or with lapply. For instance, this code is a lot cleaner than a for loop but still not quite what I'm looking for. Any help would be greatly appreciated!
Here's what I've got so far:
lapply(lapply(df, summary), function(x) x/sum(x))
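To make it concrete why this is "not quite" the target, here is a minimal sketch on the small two-column example from the top of the question. The toy `df` and the name `props` are assumptions made for illustration only; the real data has four columns with longer level labels.

```r
# Toy data matching the small example at the top of the question
# (an illustrative stand-in, not the real data frame)
df <- data.frame(
  Q1 = factor(c("A", "A", "B", "C", "D")),
  Q2 = factor(c("A", "B", "B", "C", "D"))
)

# summary() on a factor returns the count per level;
# dividing by the total turns the counts into proportions
props <- lapply(lapply(df, summary), function(x) x / sum(x))

# The result is a named list with one numeric vector per question,
# not a long data.frame with question / response / percent columns
str(props)
```

The proportions themselves are right; what is missing is the reshaping step that stacks the list into one row per question and response.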
EDIT: Including example of data frame per request. I was originally afraid it would take up too much space since the level labels are so long, so I shortened them.
dput(df[1:4,])
structure(list(Q1 = structure(c(4L, 4L, 1L, 4L), .Label = c("1.A",
"1.B", "1.C", "1.D"), class = "factor"),
Q2 = structure(c(4L, 4L, 4L, 1L), .Label = c("2.A","2.B",
"2.C", "2.D"), class = "factor"),
Q3 = structure(c(4L, 3L, 4L, 4L), .Label = c("3.A","3.B",
"3.C","3.D"), class = "factor"),
Q4 = structure(c(3L, 1L, 3L, 3L), .Label = c("4.A","4.B",
"4.C","4.D"), class = "factor")),
.Names = c("Q1.pre", "Q2.pre", "Q3.pre", "Q4.pre"), row.names = c(NA, 4L),
class = "data.frame")
I've found that a combination of Lafortune and user20650's responses has given me almost exactly what I've been looking for:
melt(sapply(df, function(x) prop.table(table(x))))
However, there's one problem. At the sapply level, the dimnames are the same as the label names of the levels for Q1, so after performing melt on the output of sapply, the Var1 column is just a repetition of Q1's levels, whereas I'd like Var1 to have Q1's levels in the Q1 rows, Q2's levels in the Q2 rows, etc. I found a workaround by pulling the levels of all of the columns into a separate variable qnames before performing any operations on df, like so:
qnames = melt(sapply(df, levels))
qnames = qnames[ ,3]  # keep only the level labels, in column order
df = melt(sapply(df, function(x) prop.table(table(x))))  # long table of proportions
df = cbind(qnames, df)
Which is exactly the result I need. I'm interested to see if there is a way to achieve this without the extra sapply and cbind, so I'll leave the question open a little longer. Thanks for your help!
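One possible direction, sketched here in base R rather than reshape2, is to build one small data.frame per question and stack them, so each question keeps its own level labels. This is only a sketch under assumptions: the two-column `df` below is a cut-down stand-in for the real data, and the names `m` and `long` are hypothetical.

```r
# Cut-down stand-in for the real data: each question has its own level labels
df <- data.frame(
  Q1 = factor(c("1.A", "1.A", "1.B", "1.D"),
              levels = c("1.A", "1.B", "1.C", "1.D")),
  Q2 = factor(c("2.D", "2.D", "2.D", "2.A"),
              levels = c("2.A", "2.B", "2.C", "2.D"))
)

# The problem described above: sapply() simplifies to a matrix whose
# row names come from the first column's levels only
m <- sapply(df, function(x) prop.table(table(x)))
rownames(m)

# Alternative: one data.frame per question, then rbind them together,
# so the response labels are read from each column separately
long <- do.call(rbind, lapply(names(df), function(q) {
  p <- prop.table(table(df[[q]]))
  data.frame(question = q,
             response = names(p),
             percent  = as.vector(p),
             stringsAsFactors = FALSE)
}))

long
```

Because the labels are taken per column inside the lapply call, no separate qnames lookup or cbind step is needed. Levels with no responses still appear, with a proportion of 0.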