R - Convert and transpose data to columns by group

Question

I would greatly appreciate assistance from an R expert after struggling to solve this challenge for a few hours by trying tidyr, reshape, spread, etc.

With a data frame of varying groups and assigned values, is there a way to convert and transpose the data frame such that each group is assigned to a new column and all of the assigned values are listed under the group?

Here is some sample code for a data frame:

a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- c("Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9")

With these two columns, create a new column for each group.

Below, I am showing this manually but would need R to automate this next step.

I manually included "--" to avoid the error "Error in data.frame: arguments imply differing number of rows". In practice, I would not be able to add "--" for each group.

Group1 <- c("Item1", "Item2", "Item3", "--", "--")
Group2 <- c("Item4", "Item5", "Item6", "Item7", "Item8")
Group3 <- c("Item9", "--", "--", "--", "--")

Below, this is the output that I am trying to create.

table <- data.frame(Group1, Group2, Group3)

The challenge is that the variables have to be dynamic. The number of groups and number of items will change for different data sets, and I am unable to manually "--" for the blanks in each group.

The question is similar to this one, except my question involves dynamic ranges. Convert data frame common rows to columns

The linked duplicate answer does not address the situation here, with the "arguments imply differing number of rows" error. — Brian Stamper, Jul 06 '18 at 14:28

Maurits Evers · Accepted Answer · 2018-07-06T12:45:49.860

We can use tidyr::spread

library(tidyverse)
df %>% group_by(a) %>% mutate(n = 1:n()) %>% spread(a, b) %>% select(-n)
## A tibble: 5 x 3
#  Group1 Group2 Group3
#  <fct>  <fct>  <fct>
#1 Item1  Item4  Item9
#2 Item2  Item5  NA
#3 Item3  Item6  NA
#4 NA     Item7  NA
#5 NA     Item8  NA

Or if you prefer "--" instead of NA you can do (thanks @AntoniosK)

df %>%
    group_by(a) %>%
    mutate(n = 1:n()) %>%
    spread(a, b) %>%
    select(-n) %>%
    mutate_all(~ifelse(is.na(.), "--", as.character(.)))
## A tibble: 5 x 3
#  Group1 Group2 Group3
#  <chr>  <chr>  <chr>
#1 Item1  Item4  Item9
#2 Item2  Item5  --
#3 Item3  Item6  --
#4 --     Item7  --
#5 --     Item8  --

or using tidyr::spreads fill argument

df %>%
    mutate_if(is.factor, as.character) %>%
    group_by(a) %>%
    mutate(n = 1:n()) %>%
    spread(a, b, fill = "--") %>%
    select(-n)

giving the same result.

Sample data

a <- c("Group1", "Group1", "Group1", "Group2", "Group2", "Group2", "Group2", "Group2", "Group3")
b <- c("Item1", "Item2", "Item3", "Item4", "Item5", "Item6", "Item7", "Item8", "Item9")
df <- data.frame(a = a, b = b)

R - Convert and transpose data to columns by group

1 Answers1

Sample data