How do I transform this dataframe of stacked variables to a dataframe with one variable per column in R? (Long to Wide)

Question

I apologize if this is a duplicate but I can't seem to make any solutions I've found work for my problem. My problem seems simple, but I just can't seem to make anything give me the output I need.

I have a dataframe with all of the variables stacked like so:

Variable	Old_Value	New Value
Var1	A1	A2
Var1	A3	A4
Var1	A5	A6
Var2	B1	B2
Var2	B3	B4
Var2	B5	B6

And I just want each Variable in its own column, like so:

Var1_Old	Var1_New	Var2_Old	Var2_New
A1	A2	B1	B2
A3	A4	B3	B4
A5	A6	B5	B6

I've played around with spread, dcast, pivot_wider, but am running into issues because I don't have a key value. I'm sure I'm missing something simple....

Thank you for your help!

Does this answer your question? [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) — deschen, Mar 23 '21 at 18:14
Is there any reason why the first result row is `A1, A2, B1, B2` instead of `A1, A2, B3, B4`? Your transformation seems to have some sort of implied link between (in the original data) the first row of `Var1` and the first row of `Var2` - which normally would be coded as a key value. My question here is trying to understand whether that link is deliberate or if it doesn't really matter. Your example also seems to assume that there are an equal number of `Var1` rows and `Var2` rows - is this the case? The simplest solution is probably to just create a key value column... — Gregor Thomas, Mar 23 '21 at 18:16
Gregor - To answer your questions, 1. There is no implied link between the first row of Var 1 and the first row of Var2. I am simply trying to unstack variable values. And 2. The number of rows per var could change, i.e., there could be multiple values for each var that go from old to new. Thank you for your help! — Garen Pledge, Mar 23 '21 at 18:23
deschen - Thank you for the suggestion! I looked at that example and couldn't make it work because there is no key value that ties everything together. But like I mentioned in my post, I feel like I'm missing something simple. — Garen Pledge, Mar 23 '21 at 18:24

score 0 · Accepted Answer · answered Mar 23 '21 at 18:25

0

This is how I would have done it:

library(tibble)
library(tidyr)
tibble(
  variable = paste0('Var', rep(1:2, each = 3)),
  old_value = c(paste0(rep(c('A', 'B'), each = 3), c(1,3,5))),
  new_value = c(paste0(rep(c('A', 'B'), each = 3), c(2,4,6))),
) %>% 
  gather('k', 'v', -1) %>% 
  unite('tmp', variable, k) %>%
  mutate(
    tmp2 = rep(1:3, 4)
  ) %>% 
  spread(tmp, v) %>% 
  select(-tmp2)

Output:

# A tibble: 3 x 4
  Var1_new_value Var1_old_value Var2_new_value Var2_old_value
  <chr>          <chr>          <chr>          <chr>         
1 A2             A1             B2             B1            
2 A4             A3             B4             B3            
3 A6             A5             B6             B5

answered Mar 23 '21 at 18:25

Baraliuh

593
3
12

Thank you so much! In your solution Tmp2 becomes the key value to use spread. Makes sense. Can I ask a quick follow up? Would there be an easy way to modify tmp2 if Var1 and Var2 had a different number of values? Basically instead of tmp2 taking on values between 1 and 3, tmp2 would take on values of 1 to x where x is the max number of rows per Var? – Garen Pledge Mar 23 '21 at 18:51
The spread variable is tmp, tmp2 is used to make rows unique. If var1 or var2 have different lengths, the code needs to be somewhat modified so that tmp2 gives a unique integer for each var1 and var2. This will introduce NA values in the final output. – Baraliuh Mar 23 '21 at 18:56
Got it, thanks for the clarification and help! – Garen Pledge Mar 23 '21 at 19:01
FYI, `spread` and `gather` have been superseded by `pivot_wider` and `pivot_longer`. – LMc Mar 23 '21 at 19:33
Yeah, an old habit I guess. Though, I usually find `spread` to be faster than `pivot_longer` maybe just because I am not as used to it. – Baraliuh Mar 24 '21 at 17:21

How do I transform this dataframe of stacked variables to a dataframe with one variable per column in R? (Long to Wide)

1 Answers1