I have the following data frame in R:

  DF2<-data.frame("ID"=c("A", "A", "A", "B", "B", "B", "B", 'B'), 
  'Freq'=c(1,2,3,1,2,3,4,5), "Val"=c(1,2,4, 2,3,4,5,8))

The data frame looks like this:

   ID Freq Val
1  A    1   1
2  A    2   2
3  A    3   4
4  B    1   2
5  B    2   3
6  B    3   4
7  B    4   5
8  B    5   8

I want to melt and recast the dataframe to yield the following dataframe

   A_Freq A_Value B_Freq B_Value
1      1       1      1       2
2      2       2      2       3
3      3       4      3       4
4     NA      NA      4       5
5     NA      NA      5       8

I have tried the following code

 DF3<-melt(DF2, by=ID)
 DF3$ID<-paste0(DF3$ID, DF3$variable)
 DF3$variable<-NULL
 DF4<-dcast(DF3, value~ID)

This yields the following dataframe

     value AFreq AVal BFreq BVal
 1     1     1    1     1   NA
 2     2     2    2     2    2
 3     3     3   NA     3    3
 4     4    NA    4     4    4
 5     5    NA   NA     5    5
 6     8    NA   NA    NA    8

How can I obtain the desired result shown earlier? I have tried other variations of dcast but have been unable to get it. Any help would be appreciated.

Raghavan vmvs

1 Answer

One option would be

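library(tidyverse)
DF2 %>% 
    # wide to long: one row per ID/key/val combination
    gather(key, val, -ID) %>%
    # build the target column names, e.g. "A_Freq"
    unite(IDkey, ID, key) %>% 
    group_by(IDkey) %>%
    # row index within each target column
    mutate(rn = row_number()) %>% 
    spread(IDkey, val) %>%
    select(-rn)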
# A tibble: 5 x 4
#  A_Freq A_Val B_Freq B_Val
#   <dbl> <dbl>  <dbl> <dbl>
#1      1     1      1     2
#2      2     2      2     3
#3      3     4      3     4
#4     NA    NA      4     5
#5     NA    NA      5     8
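
Since gather() and spread() are superseded in current tidyr, the same idea can be sketched with pivot_wider(). This is only an illustrative variant, not part of the original answer; the name `wide2` is made up here, and the final select() is used both to drop the helper column `rn` and to put the columns in the order the question asks for.

```r
library(dplyr)
library(tidyr)

DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

wide2 <- DF2 %>%
    group_by(ID) %>%
    # row position within each ID; this becomes the row key of the result
    mutate(rn = row_number()) %>%
    ungroup() %>%
    # one column per ID/measure pair, named e.g. "A_Freq"
    pivot_wider(names_from = ID, values_from = c(Freq, Val),
                names_glue = "{ID}_{.value}") %>%
    # drop rn and fix the column order
    select(A_Freq, A_Val, B_Freq, B_Val)
```

Rows 4 and 5 of the A columns are filled with NA because group A has only three observations.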

Or use melt/dcast from data.table. First melt, specifying id.var as "ID" (as a string), to convert from 'wide' to 'long' format. Then use dcast to reshape from 'long' back to 'wide' with the formula rowid(ID, variable) ~ paste(ID, variable, sep="_"). The right-hand side of ~ pastes the ID and variable values together to form the new column names, while rowid() on the left-hand side gives a sequence number within each ID/variable combination, so every observation lands on its own row.

library(data.table)
dcast(melt(setDT(DF2), id.var = "ID"), rowid(ID, variable) ~ 
     paste(ID, variable, sep="_"))[, ID := NULL][]
#   A_Freq A_Val B_Freq B_Val
#1:      1     1      1     2
#2:      2     2      2     3
#3:      3     4      3     4
#4:     NA    NA      4     5
#5:     NA    NA      5     8
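
To make the role of rowid() easier to see, the one-liner can be unrolled into steps. This is a sketch of the same approach with intermediate columns; the names `long`, `rn`, `col` and `wide` are chosen here for illustration and do not appear in the original answer.

```r
library(data.table)

DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

# wide to long: columns ID, variable ("Freq"/"Val"), value
long <- melt(as.data.table(DF2), id.vars = "ID")

# rn counts 1, 2, 3, ... within each ID/variable combination
long[, rn := rowid(ID, variable)]
# col holds the target column name, e.g. "A_Freq"
long[, col := paste(ID, variable, sep = "_")]

# long to wide: one row per rn, one column per col
wide <- dcast(long, rn ~ col, value.var = "value")
```

Here the helper column is called `rn`, so it would be dropped with `wide[, rn := NULL]` rather than `ID := NULL`.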

In the OP's code, the formula is value ~ ID, so dcast creates a 'value' column with one row per unique element of 'value' and, at the same time, automatically picks 'value' as the value.var. This results in more rows than expected: six, one per distinct value, instead of five.
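
The row-count difference can be checked directly on the melted data. A small sketch, assuming data.table is loaded; `n_rows_by_value` and `n_rows_by_position` are illustrative names only.

```r
library(data.table)

DF2 <- data.frame(ID   = c("A", "A", "A", "B", "B", "B", "B", "B"),
                  Freq = c(1, 2, 3, 1, 2, 3, 4, 5),
                  Val  = c(1, 2, 4, 2, 3, 4, 5, 8))

long <- melt(as.data.table(DF2), id.vars = "ID")

# casting on 'value' yields one row per distinct value: 1, 2, 3, 4, 5, 8
n_rows_by_value <- uniqueN(long$value)

# casting on a per-group counter yields one row per position in the longest group
n_rows_by_position <- max(rowid(long$ID, long$variable))
```

The first count is 6 and the second is 5, which matches the six-row result in the question versus the five-row target.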

akrun