Reshape acast() remove missing values

Question

I have this dataframe:

df <- data.frame(subject = c(rep("one", 20), c(rep("two", 20))),
                 score1 = sample(1:3, 40, replace=T),
                 score2 = sample(1:6, 40, replace=T),
                 score3 = sample(1:3, 40, replace=T),
                 score4 = sample(1:4, 40, replace=T))

   subject score1 score2 score3 score4
1      one      2      4      2      2
2      one      3      3      1      2
3      one      1      2      1      3
4      one      3      4      1      2
5      one      1      2      2      3
6      one      1      5      2      4
7      one      2      5      3      2
8      one      1      5      1      3
9      one      3      5      2      2
10     one      2      3      3      4
11     one      3      2      1      3
12     one      2      5      2      1
13     one      2      4      1      4
14     one      2      2      1      3
15     one      1      3      1      4
16     one      1      6      1      3
17     one      3      4      2      2
18     one      3      2      1      3
19     one      2      5      3      1
20     one      3      6      2      1
21     two      1      6      3      4
22     two      1      2      1      2
23     two      3      2      1      2
24     two      1      2      2      1
25     two      2      3      1      3
26     two      1      5      3      3
27     two      2      4      1      4
28     two      2      6      2      4
29     two      1      6      2      2
30     two      1      5      1      4
31     two      2      1      2      4
32     two      3      6      1      1
33     two      1      1      3      1
34     two      2      4      2      3
35     two      2      1      3      2
36     two      2      3      1      3
37     two      1      2      3      4
38     two      3      5      2      2
39     two      2      1      3      4
40     two      2      1      1      3

Note that the scores have different ranges of values: score1 ranges from 1 to 3, score2 from 1 to 6, score3 from 1 to 3, and score4 from 1 to 4.

I'm trying to reshape data like this:

library(reshape2)
dfMelt <- melt(df, id.vars="subject")

acast(dfMelt, subject ~ value ~ variable)

Aggregation function missing: defaulting to length
, , score1

    1 2 3 4 5 6
one 6 7 7 0 0 0
two 8 9 3 0 0 0

, , score2

    1 2 3 4 5 6
one 0 5 3 4 6 2
two 5 4 2 2 3 4

, , score3

     1 2 3 4 5 6
one 10 7 3 0 0 0
two  8 6 6 0 0 0

, , score4

    1 2 3 4 5 6
one 3 6 7 4 0 0
two 3 5 5 7 0 0

Note that the output array reports a count of 0 for score values that never occur for a given score. Is there any way to stop acast from including these missing scores in its output?

The return value of `acast` is an array, so the only alternative is `NA` and you can transform 0 to `NA` afterwards. — Roland, Jul 14 '13 at 15:39
@Roland, you can use the `fill` parameter to `acast`: `acast(dfMelt, subject ~ value ~ variable, fill=NA_integer_)` to directly get `NA`. — Arun, Jul 14 '13 at 16:22
@luciano, when you say "stop them from outputting" do you mean output to screen, or do you want them to not be included in each array? — Ricardo Saporta, Jul 14 '13 at 17:23
What do you envision the result looking like? It appears @Ananda Mahto gave you a great answer — Ricardo Saporta, Jul 14 '13 at 18:30

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2013-07-14T15:44:02.463

In this case, you might do better sticking with base R's `table` function. I'm not sure that you can have an irregular array like the one you are looking for.

For example:

> lapply(df[-1], function(x) table(df[[1]], x))
$score1
     x
       1  2  3
  one  9  6  5
  two 11  4  5

$score2
     x
      1 2 3 4 5 6
  one 2 5 4 3 3 3
  two 4 2 2 3 4 5

$score3
     x
       1  2  3
  one  9  5  6
  two  4 11  5

$score4
     x
      1 2 3 4
  one 4 4 8 4
  two 2 6 5 7

Or, using your "long" data:

with(dfMelt, by(dfMelt, variable, 
                FUN = function(x) table(x[["subject"]], x[["value"]])))
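A self-contained sketch of the `table` approach, for anyone who wants to reproduce it. The seed and the names `score_cols` and `tabs` are assumptions added for illustration and are not part of the original answer; because the data are random, the counts will not match the output printed above.

```r
# Arbitrary seed, used only so the sketch is reproducible
set.seed(42)

df <- data.frame(subject = rep(c("one", "two"), each = 20),
                 score1 = sample(1:3, 40, replace = TRUE),
                 score2 = sample(1:6, 40, replace = TRUE),
                 score3 = sample(1:3, 40, replace = TRUE),
                 score4 = sample(1:4, 40, replace = TRUE))

# Every column except the grouping column
score_cols <- setdiff(names(df), "subject")

# One subject-by-value contingency table per score column.
# table() only creates columns for values that actually occur,
# so each table can have a different width.
tabs <- lapply(df[score_cols],
               function(x) table(subject = df$subject, value = x))

# Number of distinct values observed for each score
sapply(tabs, ncol)
```

The result is a named list rather than a 3-D array, which is what allows each element to have its own number of columns.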

score 1 · Answer 2 · answered Jul 14 '13 at 18:29

Since each "score" subset is going to have a different shape, you will not be able to preserve the array structure. One option is to use a list of two-dimensional arrays or data.frames, e.g.:

# your original acast call
res  <-  acast(dfMelt, subject ~ value ~ variable)

# remove any columns that are all zero
# (drop = FALSE keeps a matrix even if only one column survives)
apply(res, 3, function(x) x[, colSums(x) != 0, drop = FALSE])

Which gives:

$score1
    1 2 3
one 7 8 5
two 6 8 6

$score2
    1 2 3 4 5 6
one 4 2 6 4 1 3
two 2 5 3 4 3 3

$score3
    1  2 3
one 5 10 5
two 5 11 4

$score4
    1 2 3 4
one 5 4 4 7
two 4 6 6 4
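One caveat with `apply()` here: if every slice happened to keep the same number of columns, `apply()` would simplify the result into a single matrix instead of returning a list. A minimal sketch of a variant that always returns a list is given below. To keep it runnable without reshape2, it builds a stand-in for `res` with base `table()`; the names `long` and `trimmed`, the seed, and the reduced two-score data set are assumptions made for illustration.

```r
# Arbitrary seed and a smaller data set, for illustration only
set.seed(1)
df <- data.frame(subject = rep(c("one", "two"), each = 20),
                 score1 = sample(1:3, 40, replace = TRUE),
                 score2 = sample(1:6, 40, replace = TRUE))

# Long format built with base R: stack() yields columns `values` and `ind`
long <- data.frame(subject = rep(df$subject, times = ncol(df) - 1),
                   stack(df[-1]))

# Stand-in for the acast() result: a subject x value x variable count array
res <- unclass(table(long$subject, long$values, long$ind))

# Loop over the third dimension by name; lapply() never simplifies,
# so the result is a list even when all slices have the same shape
trimmed <- lapply(setNames(nm = dimnames(res)[[3]]), function(v) {
  slice <- res[, , v]
  slice[, colSums(slice) != 0, drop = FALSE]
})

trimmed
```

The same `lapply()` call should work unchanged on the array returned by `acast()`, since it only relies on the third dimension having names.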
