
I would like to reshape the data frame into wide format, using two variables as the identifying columns (which may not even be necessary). I mention this because the original df has 480 rows and several sub-levels.

This returns a warning:

library(tidyr)
library(dplyr)
                                                                
df <- structure(list(ID = c(1, 2, 3, 4), Gender = c("Men", "Women", "Men", 
"Women"), Country = c("Austria", "Austria", "Austria", "Austria"
), Season_ID = c("2011", "2012", "2011", "2012"), Region_UN = c("A", 
"B", "A", "B")), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

df_wide <- df %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(Country, Season_ID))

Warning message: Values are not uniquely identified; output will contain list-cols.

  • Use values_fn = list to suppress this warning.
  • Use values_fn = length to identify where the duplicates arise
  • Use values_fn = {summary_fun} to summarise duplicates

I don't know which argument I could put in values_fn!

Cristiano

2 Answers

You can also paste it together:

df_wide <- df %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(Country, Season_ID),
              values_fn = function(x) paste(x, collapse=","))

df_wide
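
Before choosing a summary function, it can help to check where the duplicates actually are, as the warning itself suggests with values_fn = length. A minimal sketch, assuming the same example data as in the question (rebuilt here as a plain data.frame so the block runs on its own; the name counts is only illustrative):

```r
library(tidyr)

# Same example data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  ID = c(1, 2, 3, 4),
  Gender = c("Men", "Women", "Men", "Women"),
  Country = "Austria",
  Season_ID = c("2011", "2012", "2011", "2012"),
  Region_UN = c("A", "B", "A", "B")
)

# Count how many values land in each Country/Season_ID/Gender cell;
# any count above 1 marks a cell that is not uniquely identified
counts <- pivot_wider(df,
                      names_from = Gender,
                      values_from = Region_UN,
                      id_cols = c(Country, Season_ID),
                      values_fn = length)
counts
```

Here every filled cell holds two values, which is why pivot_wider() falls back to list-columns unless values_fn collapses them.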

And since the duplicated values are identical here, you can also simply keep the first one:

df_wide <- df %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(Country, Season_ID),
              values_fn = first)
df_wide    
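
If the duplicates are exact repeats, as they are in the example data, another option is to drop them before pivoting so that no values_fn is needed. This is only a sketch under that assumption; on the real 480-row data it would have to be checked first that rows sharing Country, Season_ID and Gender really do have the same Region_UN.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  ID = c(1, 2, 3, 4),
  Gender = c("Men", "Women", "Men", "Women"),
  Country = "Austria",
  Season_ID = c("2011", "2012", "2011", "2012"),
  Region_UN = c("A", "B", "A", "B")
)

df_wide <- df %>%
  # Keep one row per unique combination; ID is dropped because it differs per row
  distinct(Country, Season_ID, Gender, Region_UN) %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(Country, Season_ID))
df_wide
```

If a combination turns out to have differing Region_UN values, distinct() keeps both rows and the original warning comes back, which serves as a useful check.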
tpetzoldt

We can create a sequence column with rowid() from data.table, so that every row is uniquely identified and all values are kept:

library(dplyr)
library(tidyr)
library(data.table)
df %>%
  mutate(ID = NULL, rn = rowid(Country, Season_ID)) %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(rn, Country, Season_ID))
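
The same sequence column can probably be built without data.table, using group_by() and row_number() from dplyr. A minimal sketch under that assumption, reusing the column name rn from above and the example data from the question:

```r
library(dplyr)
library(tidyr)

# Same example data as in the question, rebuilt so this block is self-contained
df <- data.frame(
  ID = c(1, 2, 3, 4),
  Gender = c("Men", "Women", "Men", "Women"),
  Country = "Austria",
  Season_ID = c("2011", "2012", "2011", "2012"),
  Region_UN = c("A", "B", "A", "B")
)

df_wide <- df %>%
  select(-ID) %>%
  # Number the rows within each Country/Season_ID group
  group_by(Country, Season_ID) %>%
  mutate(rn = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = Gender,
              values_from = Region_UN,
              id_cols = c(rn, Country, Season_ID))
df_wide
```

This keeps all four original rows, with NA in the column of the gender that does not apply, and avoids data.table masking dplyr functions such as first().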
akrun