1

Let say I have below data.frame

library(reshape2)

set.seed(1)

dat = data.frame(X1 = sample(letters, 10, replace = T), X2 = sample(letters, 10, replace = T), X3 = sample(LETTERS, 10, replace = T), X4 = sample(LETTERS[1:4], 10, replace = T), X5 = sample(11:13, 10, replace = T), X6 = sample(200:201, 10, replace = T))

dat

   X1 X2 X3 X4 X5  X6
1   q  w  J  B 12 201
2   t  c  U  A 12 200
3   q  c  W  A 11 200
4   b  u  E  C 11 201
5   p  m  C  B 13 201
6   g  t  V  A 12 201
7   t  d  F  C 13 201
8   x  t  P  D 11 201
9   d  e  E  D 13 200
10  l  m  L  D 13 201

Now I want to un-melt above data frame such that, the column will be the unique combination of the values of columns X4 & X5, value vector will be X6, which will corresponds to all rows except X4, X5, and X6 (my actual dataframe has many columns so I can not explicitly name those remaining columns)

So in above case, the columns of the final dataframe will be {X1, X2, X3, A-12 A-13, ... B-12, B-13, .. etc}

Can you please help me how to achieve this with dcast() function from reshape?

Thanks for your pointer.

Martin Gal
  • 16,640
  • 5
  • 21
  • 39
Brian Smith
  • 1,200
  • 4
  • 16
  • 1
    Are you looking for something like `dcast(dat, X1 + X2 + X3 ~ X4 + X5)`? – Martin Gal Jun 29 '21 at 21:27
  • Yes, except the fact that I can not explicitly mention `X1+X2+X3` because number of columns in my data frame is large. So it will be `X1+X2+...+Xn`. Therefore I need to use this expression programmatically – Brian Smith Jun 29 '21 at 21:43
  • 1
    So more like `dcast(dat, list(evalq(paste0("X",1:3)), .(X4, X5)))` with `library(plyr)`? – Martin Gal Jun 29 '21 at 21:50
  • 1
    Are you limited to `reshape2`? Because `tidyr` superseded reshape2. The replacement functions are `pivot_longer()` and `pivot_wider()` https://tidyr.tidyverse.org/#related-work https://tidyr.tidyverse.org/articles/pivot.html – M.Viking Jun 29 '21 at 21:51

2 Answers2

2

The casting formula of dcast() allows for a special variable ... which represents all other variables not used in the formula. So,

library(reshape2)
dcast(dat, ... ~ X4 + X5, value.var = "X6")

should return the expected result:

   X1 X2 X3 A_13 B_11 B_12 B_13 C_11 C_13 D_12
1   a  j  Y   NA   NA   NA   NA   NA   NA  201
2   b  v  N   NA   NA   NA   NA  201   NA   NA
3   d  u  E   NA   NA   NA   NA  201   NA   NA
4   g  u  I  201   NA   NA   NA   NA   NA   NA
5   k  j  E   NA   NA   NA  201   NA   NA   NA
6   n  g  B   NA   NA  200   NA   NA   NA   NA
7   r  i  J   NA   NA  201   NA   NA   NA   NA
8   s  o  Y   NA   NA   NA   NA   NA  201   NA
9   w  n  E   NA  201   NA   NA   NA   NA   NA
10  y  a  U   NA   NA   NA   NA   NA   NA  200

By the way, enhanced versions of the melt() and dcast() functions are also available in the data.table package.

Uwe
  • 41,420
  • 11
  • 90
  • 134
0

You can unite X4 and X5 columns and use pivot_wider.

library(dplyr)
library(tidyr)

dat %>%
  unite(col, X4, X5) %>%
  pivot_wider(names_from = col, values_from = X6)

#   X1    X2    X3     D_12  C_11  A_13  B_11  B_13  B_12  C_13
#   <chr> <chr> <chr> <int> <int> <int> <int> <int> <int> <int>
# 1 y     a     U       200    NA    NA    NA    NA    NA    NA
# 2 d     u     E        NA   201    NA    NA    NA    NA    NA
# 3 g     u     I        NA    NA   201    NA    NA    NA    NA
# 4 a     j     Y       201    NA    NA    NA    NA    NA    NA
# 5 b     v     N        NA   201    NA    NA    NA    NA    NA
# 6 w     n     E        NA    NA    NA   201    NA    NA    NA
# 7 k     j     E        NA    NA    NA    NA   201    NA    NA
# 8 n     g     B        NA    NA    NA    NA    NA   200    NA
# 9 r     i     J        NA    NA    NA    NA    NA   201    NA
#10 s     o     Y        NA    NA    NA    NA    NA    NA   201 
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213