pivot_wider not collapsing rows

Question

got a pretty basic question to ask unfortunately, I am trying to use a pivot_wider to make my data into a panel.

variable id reports gp & ge every year, the column t denotes the year. I want a separate variable gp_t and ge_t for every year in the data(i.e. t =[2011 -2013])

When I use pivot_wider, I am getting the right number of columns, but the rows are not collapsing into each other as they have in the past when I have used this function. Can someone please give me a hand? Because I am using t as the "names_from" for 2 different columns, I duplicated it... was this the wrong thing to do?

df$t2<-df$t

df<-df %>%
select(id,t,t2,gp,ge)%>%
group_by(id)%>%
pivot_wider(names_from = "t", names_prefix= "gp_" values_from = "gp")%>%
pivot_wider(names_from = "t2", names_prefix= "ge_" values_from = "ge")

NB: company 01 went out of business in 2013, this needs to be kept visible in the table if possible!

I am currently getting this:

id	gp_2011	gp_2012	gp_2013	ge_2011	ge_2012	ge_2013
01	25	NA	NA	12	NA	NA
01	NA	32	NA	NA	22	NA
02	95	NA	NA	62	NA	NA
02	NA	73	NA	NA	41	NA
02	NA	NA	68	NA	NA	55
03	24	NA	NA	16	NA	NA
03	NA	34	NA	NA	22	NA
03	NA	NA	41	NA	NA	20

I want this:

id	gp_2011	gp_2012	gp_2013	ge_2011	ge_2012	ge_2013
01	25	32	NA	12	22	NA
02	95	73	68	62	41	55
03	24	34	41	16	22	20

Any help or handy hints greatly appreciated!

Thanks!

Can you share the sample input for this output? `dput()` would be the nicest way to share it as it is easily copy/pasteable, `dput(your_sample_data[1:9, ])` for the first 9 rows. — Gregor Thomas, May 20 '22 at 05:21
@GregorThomas I am sorry to say I cannot! The data I am working with is sensitive :( — Gilrob, May 20 '22 at 05:45
Welcome to SO! Please, [edit] your question and provide a [mcve] in order to reproduce your question and to verify the answers are correct. No one asks to disclose your original data but if you expect answers we need something to work with. Thank you. — Uwe, May 20 '22 at 06:38
Surely you can share whatever sample input corresponds to the sample output you have already shared?? It's difficult to help when you only provide output, not input. Looks like Stefan went to the trouble to construct it for you, but in the future please provide sample input to make it easy for people to help you. — Gregor Thomas, May 20 '22 at 17:42
@GregorThomas apologies I didn't really understand your point, (pretty new to this) will keep it in mind for future. — Gilrob, May 23 '22 at 00:32
@Uwe thanks for the help and the link! Duly noted, will get it right next time :) thanks! — Gilrob, May 23 '22 at 00:32
@Gilrob my point is that you already shared some data. You shared only the **output** that you are seeking. Presumably the information you shared in the output is not "sensitive". I wasn't asking you to share your entire sensitive data, just whatever small sample of **input** corresponds to the output that you already shared. When you don't share input, you are making it hard for people to help you: anyone who wants to help has an extra step of reconstructing your input--as Stefan did in the answer. So next time, please share input. — Gregor Thomas, May 23 '22 at 13:50
@GregorThomas understood, thanks for taking the time to explain it to me, will do in future! — Gilrob, May 24 '22 at 01:52

score 1 · Accepted Answer · answered May 20 '22 at 06:50

There is neither need to duplicate your t column nor to use two pivot_wider. Instead you could achieve your desired result like so:

library(dplyr)
library(tidyr)

df %>%
  select(id, t, gp, ge) %>%
  pivot_wider(names_from = "t", values_from = c(gp, ge))
#> # A tibble: 3 × 7
#>   id    gp_2011 gp_2012 gp_2013 ge_2011 ge_2012 ge_2013
#>   <chr>   <int>   <int>   <int>   <int>   <int>   <int>
#> 1 01         25      32      NA      12      22      NA
#> 2 02         95      73      68      62      41      55
#> 3 03         24      34      41      16      22      20

DATA

df_wide <- data.frame(
                id = c("01", "02", "03"),
           gp_2011 = c(25L, 95L, 24L),
           gp_2012 = c(32L, 73L, 34L),
           gp_2013 = c(NA, 68L, 41L),
           ge_2011 = c(12L, 62L, 16L),
           ge_2012 = c(22L, 41L, 22L),
           ge_2013 = c(NA, 55L, 20L)
           )

library(tidyr)

df <- df_wide %>%
  pivot_longer(-id, names_to = c(".value", "t"), names_sep = "_")

Thanks very much for the help @Stefan, this looks much nicer, testing it out now! — Gilrob, May 23 '22 at 00:32

id	gp_2011	gp_2012	gp_2013	ge_2011	ge_2012	ge_2013
01	25	NA	NA	12	NA	NA
01	NA	32	NA	NA	22	NA
02	95	NA	NA	62	NA	NA
02	NA	73	NA	NA	41	NA
02	NA	NA	68	NA	NA	55
03	24	NA	NA	16	NA	NA
03	NA	34	NA	NA	22	NA
03	NA	NA	41	NA	NA	20

id	gp_2011	gp_2012	gp_2013	ge_2011	ge_2012	ge_2013
01	25	NA	NA	12	NA	NA
01	NA	32	NA	NA	22	NA
02	95	NA	NA	62	NA	NA
02	NA	73	NA	NA	41	NA
02	NA	NA	68	NA	NA	55
03	24	NA	NA	16	NA	NA
03	NA	34	NA	NA	22	NA
03	NA	NA	41	NA	NA	20

pivot_wider not collapsing rows

1 Answers1

id	gp_2011	gp_2012	gp_2013	ge_2011	ge_2012	ge_2013
01	25	NA	NA	12	NA	NA
01	NA	32	NA	NA	22	NA
02	95	NA	NA	62	NA	NA
02	NA	73	NA	NA	41	NA
02	NA	NA	68	NA	NA	55
03	24	NA	NA	16	NA	NA
03	NA	34	NA	NA	22	NA
03	NA	NA	41	NA	NA	20