dcasting long table along multiple columns

Question

I am having a hard time understanding dcast and cannot find the right call to get what I want. Here is a minimal reproducible example.

#generate the data
ID <- c('a','a','a','b','b','b')
Parameter <- c('p1','p2','p3','p1','p2','p3')
Value <- c('yes','no','3','yes','yes','2')
Comment <- c(NA,'Deduced','To verify',NA,'Deduced','Verified')
Source <- c('Exp.1','Exp.1','Exp.1+2','DB2','DB2','DB2')
Person <- c('X','X','X','Y','Y','Z')
long.data <- data.frame(ID,Parameter,Value,Comment,Source,Person)

  ID Parameter Value   Comment  Source Person
1  a        p1   yes      <NA>   Exp.1      X
2  a        p2    no   Deduced   Exp.1      X
3  a        p3     3 To verify Exp.1+2      X
6  b        p1   yes      <NA>     DB2      Y
7  b        p2   yes   Deduced     DB2      Y
8  b        p3     2  Verified     DB2      Y

I want to turn this into the following wide.data format:

  ID Person  p1 p1-Comment p1-Source   p2 p2-Comment p2-Source p3 p3-Comment p3-Source
1  a      X yes       <NA>     Exp.1   no    Deduced     Exp.1  3  To verify   Exp.1+2
2  b      Y yes       <NA>       DB2  yes    Deduced       DB2  2   Verified       DB2

I can assume that every ID has the same Person. I believed I could dcast this, but I have not found a way that doesn't produce garbage columns. There is probably a straightforward approach that I am just missing.

Please post the code for what you attempted. Also, the reshape2 package has been retired; check out tidyr and `pivot_wider()`. — L Tyrone, Mar 18 '23 at 01:52

akrun · Accepted Answer · 2023-03-18T04:38:43.480

We could use pivot_wider from tidyr:

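library(tidyr)
library(dplyr)
# One output column per Parameter/value-column pair, grouped by Parameter
pivot_wider(long.data, names_from = Parameter,
   values_from = c(Value, Comment, Source),
   names_glue = "{Parameter}-{.value}", names_vary = "slowest") %>%
   # Person is 'Z' in the last row, so ID 'b' yields a second, mostly-NA row; drop it
   filter(!is.na(`p1-Value`))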

-output

# A tibble: 2 × 11
  ID    Person `p1-Value` `p1-Comment` `p1-Source` `p2-Value` `p2-Comment` `p2-Source` `p3-Value` `p3-Comment` `p3-Source`
  <chr> <chr>  <chr>      <chr>        <chr>       <chr>      <chr>        <chr>       <chr>      <chr>        <chr>      
1 a     X      yes        <NA>         Exp.1       no         Deduced      Exp.1       3          To verify    Exp.1+2    
2 b     Y      yes        <NA>         DB2         yes        Deduced      DB2         <NA>       <NA>         <NA>
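
If the stray Person value in the last row is just noise (the question says every ID has the same Person), a variation on the call above can keep the p3 values for ID b and also produce the bare p1/p2/p3 headers that were asked for. This is only a sketch: taking the first Person seen per ID is an assumption of mine, not something the question specifies, and names_vary needs a reasonably recent tidyr.

```r
library(tidyr)
library(dplyr)

long.data <- data.frame(
  ID        = c("a", "a", "a", "b", "b", "b"),
  Parameter = c("p1", "p2", "p3", "p1", "p2", "p3"),
  Value     = c("yes", "no", "3", "yes", "yes", "2"),
  Comment   = c(NA, "Deduced", "To verify", NA, "Deduced", "Verified"),
  Source    = c("Exp.1", "Exp.1", "Exp.1+2", "DB2", "DB2", "DB2"),
  Person    = c("X", "X", "X", "Y", "Y", "Z"),
  stringsAsFactors = FALSE
)

wide.data <- long.data %>%
  # Assumption: the first Person listed for an ID is the one to keep
  group_by(ID) %>%
  mutate(Person = first(Person)) %>%
  ungroup() %>%
  pivot_wider(names_from = Parameter,
              values_from = c(Value, Comment, Source),
              names_glue = "{Parameter}-{.value}",
              names_vary = "slowest") %>%
  # Turn "p1-Value" into plain "p1"; other columns are left untouched
  rename_with(~ sub("-Value$", "", .x))

names(wide.data)
```

Because Person is made constant within each ID before pivoting, there is one row per ID and no filter step is needed afterwards.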

score 0 · Answer 2 · answered Mar 18 '23 at 04:19

Base R's reshape can do this too. With only ID as idvar, Person is spread into one column per Parameter as well:

reshape(long.data, timevar='Parameter', idvar='ID', direction='wide')
#   ID Value.p1 Comment.p1 Source.p1 Person.p1 Value.p2 Comment.p2 Source.p2 Person.p2 Value.p3 Comment.p3 Source.p3 Person.p3
# 1  a      yes       <NA>     Exp.1         X       no    Deduced     Exp.1         X        3  To verify   Exp.1+2         X
# 4  b      yes       <NA>       DB2         Y      yes    Deduced       DB2         Y        2   Verified       DB2         Z

Data:

long.data <- structure(list(ID = c("a", "a", "a", "b", "b", "b"), Parameter = c("p1", 
"p2", "p3", "p1", "p2", "p3"), Value = c("yes", "no", "3", "yes", 
"yes", "2"), Comment = c(NA, "Deduced", "To verify", NA, "Deduced", 
"Verified"), Source = c("Exp.1", "Exp.1", "Exp.1+2", "DB2", "DB2", 
"DB2"), Person = c("X", "X", "X", "Y", "Y", "Z")), class = "data.frame", row.names = c(NA, 
-6L))
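
To get closer to the layout in the question with base R alone, one option is to name the varying columns explicitly through v.names, so that Person stays a single column, and then rewrite the headers. The following is a sketch under a few assumptions: the regular expression hard-codes the three value column names, and reshape keeps the first Person it finds for each ID (it warns that Person is "really varying" for ID b in this toy data).

```r
long.data <- data.frame(
  ID        = c("a", "a", "a", "b", "b", "b"),
  Parameter = c("p1", "p2", "p3", "p1", "p2", "p3"),
  Value     = c("yes", "no", "3", "yes", "yes", "2"),
  Comment   = c(NA, "Deduced", "To verify", NA, "Deduced", "Verified"),
  Source    = c("Exp.1", "Exp.1", "Exp.1+2", "DB2", "DB2", "DB2"),
  Person    = c("X", "X", "X", "Y", "Y", "Z"),
  stringsAsFactors = FALSE
)

# Only Value, Comment and Source are spread; Person is treated as constant per ID
wide <- reshape(long.data, timevar = "Parameter", idvar = "ID",
                v.names = c("Value", "Comment", "Source"),
                direction = "wide")

# "Comment.p1" -> "p1-Comment", then "p1-Value" -> "p1"
new.names <- sub("^(Value|Comment|Source)\\.(.*)$", "\\2-\\1", names(wide))
names(wide) <- sub("-Value$", "", new.names)

names(wide)
```

The renaming step is plain string manipulation, so it can be adapted if a different separator or column order is preferred.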

When I attempt this code on my real data, I get columns named Value.c("parameter-1","parameter-2","parameter-3", ...) and so on, with NAs as values. I have been unable to reproduce this behavior with the toy dataset. The pivot_wider solution above does not do this. — David Inman, Mar 18 '23 at 08:03
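
A possible explanation for those column names, offered as a guess since the real data is not shown: reshape is written for plain data frames, and headers of the form Value.c(...) are the kind of thing that shows up when the input is a tibble (for example, data read with readr or passed through dplyr), because subsetting a single column of a tibble returns a tibble rather than a vector. If that is the case, coercing to a plain data.frame before reshaping may be enough. The sketch below uses the toy data as a stand-in for the real table.

```r
# Stand-in for the real data; assumed to arrive as a tibble in practice
long.data <- data.frame(
  ID        = c("a", "a", "a", "b", "b", "b"),
  Parameter = c("p1", "p2", "p3", "p1", "p2", "p3"),
  Value     = c("yes", "no", "3", "yes", "yes", "2"),
  Comment   = c(NA, "Deduced", "To verify", NA, "Deduced", "Verified"),
  Source    = c("Exp.1", "Exp.1", "Exp.1+2", "DB2", "DB2", "DB2"),
  Person    = c("X", "X", "X", "Y", "Y", "Z"),
  stringsAsFactors = FALSE
)

# Drop any tibble class so that data[, "Parameter"] is an ordinary vector
plain <- as.data.frame(long.data, stringsAsFactors = FALSE)

wide <- reshape(plain, timevar = "Parameter", idvar = "ID", direction = "wide")
names(wide)
```

On a table that is already a plain data.frame the coercion changes nothing, so it is harmless to leave in.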
