Reshape data to split column values into columns

Question

df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1, 1))

I have a dataframe like the one above with two columns, one containing categories and the other containing binary data.

I want to reshape the dataframe so that each value of the "animal" column becomes its own column, with the corresponding "hunger" values as the cell values, i.e.

Desired output:

df <- data.frame(dog = c(0, 1, 0),
                 cat = c(1, 1, 1))

How can I achieve this?

What would you do if `df` was 7 rows and `dog` and `cat` weren't equal length? (Also, as an aside, I don't think this is a very good plan because the data structure is not very robust.) — Ian Campbell, Mar 17 '23 at 20:38

score 11 · Accepted Answer · edited Mar 23 '23 at 13:49

In the case of uneven length among different categories, we can use

list2DF(
  lapply(
    . <- unstack(df, hunger ~ animal),
    `length<-`,
    max(lengths(.))
  )
)

or

list2DF(
  lapply(
    . <- unstack(rev(df)),
    `length<-`,
    max(lengths(.))
  )
)

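Applied to the dummy data below (four cat rows, three dog rows), both versions return a data frame in which the shorter dog column is padded with NA at the bottom.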

Dummy data

df <- data.frame(
  animal = c("dog", "dog", "cat", "dog", "cat", "cat", "cat"),
  hunger = c(0, 1, 1, 0, 1, 1, 0)
)
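To make the padding step easier to follow, here is a minimal sketch that unrolls the first variant on the dummy data. The intermediate names `parts`, `n`, `padded` and `res` are only illustrative and do not appear in the answer; `list2DF` needs R 4.0.0 or later.

```r
# Dummy data: four "cat" rows, three "dog" rows
df <- data.frame(
  animal = c("dog", "dog", "cat", "dog", "cat", "cat", "cat"),
  hunger = c(0, 1, 1, 0, 1, 1, 0)
)

# With unequal group sizes, unstack() returns a named list, not a data frame
parts <- unstack(df, hunger ~ animal)

# `length<-` extends a vector to the requested length, filling with NA
n <- max(lengths(parts))
padded <- lapply(parts, `length<-`, n)

# Turn the equal-length list into a data frame
res <- list2DF(padded)
res
#   cat dog
# 1   1   0
# 2   1   1
# 3   1   0
# 4   0  NA
```

The one-liner in the answer does the same thing, using `.` as a temporary name so that `unstack` is only called once.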

When all categories have the same length, as in the question's df, unstack alone is enough, e.g.,

> unstack(rev(df))
  cat dog
1   1   0
2   1   1
3   1   0

or

> unstack(df, hunger ~ animal)
  cat dog
1   1   0
2   1   1
3   1   0

edited Mar 23 '23 at 13:49 by TylerH
answered Mar 17 '23 at 21:20 by ThomasIsCoding

I think 2nd version, without rev, should be the one at the top. – zx8754 Mar 17 '23 at 21:29
Great solution, how would you go about turning it into a dataframe if they are of uneven length as Ian suggests? I.e. df <- data.frame(animal = c("dog", "cat", "dog", "cat", "cat"), hunger = c(1, 1, 0, 1, 1)) – Icewaffle Mar 17 '23 at 22:17
@Icewaffle what's the desired output in that case, i.e., uneven length? – ThomasIsCoding Mar 17 '23 at 22:19
Desired output would be even length with NA filling in bottom rows of smaller column – Icewaffle Mar 18 '23 at 09:32

score 9 · Answer 2 · edited Mar 18 '23 at 22:25

Base R:

df$id <- ave(df$hunger, df$animal, FUN = seq_along)
reshape(df, idvar = "id", timevar = "animal", direction = "wide")[, -1]

  hunger.dog hunger.cat
1          0          1
2          1          1
4          0          1
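A short sketch of what the `ave` line contributes, run on the question's data. The cleanup of the column and row names at the end is an optional addition that is not part of the answer, and the name `wide` is only illustrative.

```r
df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1, 1))

# Running index within each animal: this is what lines the rows up
df$id <- ave(df$hunger, df$animal, FUN = seq_along)
df$id
# [1] 1 2 1 3 2 3

wide <- reshape(df, idvar = "id", timevar = "animal", direction = "wide")[, -1]

# Optional: drop the "hunger." prefix and reset the row names
names(wide) <- sub("^hunger\\.", "", names(wide))
rownames(wide) <- NULL
wide
#   dog cat
# 1   0   1
# 2   1   1
# 3   0   1
```

Without the `id` column, `reshape` would have nothing to tell the first dog row from the second, so the per-group index is the essential step here.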

edited Mar 18 '23 at 22:25 by akrun
answered Mar 17 '23 at 21:02 by TarJae

score 9 · Answer 3 · answered Mar 17 '23 at 21:05

Using split:

data.frame(split(df$hunger, df$animal))
#   cat dog
# 1   1   0
# 2   1   1
# 3   1   0
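Note that `data.frame()` stops with an error when the groups differ in length. A possible workaround, sketched here on the uneven example from the comments above, is to pad each group before building the data frame. The names `parts`, `n` and `res` are illustrative only.

```r
# Uneven example from the comments: two "dog" rows, three "cat" rows
df <- data.frame(
  animal = c("dog", "cat", "dog", "cat", "cat"),
  hunger = c(1, 1, 0, 1, 1)
)

parts <- split(df$hunger, df$animal)
n <- max(lengths(parts))

# Indexing past the end of a vector yields NA, so x[seq_len(n)] pads each group
res <- data.frame(lapply(parts, function(x) x[seq_len(n)]))
res
#   cat dog
# 1   1   1
# 2   1   0
# 3   1  NA
```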

answered Mar 17 '23 at 21:05 by zx8754

score 8 · Answer 4 · answered Mar 17 '23 at 23:02

Using data.table

library(data.table)
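# value.var is given explicitly so dcast does not have to guess the value column
dcast(setDT(df), rowid(animal) ~ animal, value.var = "hunger")[, animal := NULL][]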

Output:

    cat dog
1:   1   0
2:   1   1
3:   1   0

answered Mar 17 '23 at 23:02 by akrun

score 7 · Answer 5 · answered Mar 17 '23 at 20:18

You could use pivot_wider by first creating an id for each group to identify the duplicates and use the names_from and values_from like this:

library(dplyr)
library(tidyr)
df %>%
  group_by(animal) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = animal, values_from = hunger) %>%
  select(-id)
#> # A tibble: 3 × 2
#>     dog   cat
#>   <dbl> <dbl>
#> 1     0     1
#> 2     1     1
#> 3     0     1

Created on 2023-03-17 with reprex v2.0.2
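As a side note, the same per-group id approach should also cover groups of different length, since pivot_wider leaves cells with no matching row as NA. A minimal sketch, assuming the seven-row uneven data below (not the question's df); the name `res` is illustrative:

```r
library(dplyr)
library(tidyr)

# Uneven data: three "dog" rows, four "cat" rows
df <- data.frame(
  animal = c("dog", "dog", "cat", "dog", "cat", "cat", "cat"),
  hunger = c(0, 1, 1, 0, 1, 1, 0)
)

res <- df %>%
  group_by(animal) %>%
  mutate(id = row_number()) %>%  # 1, 2, ... within each animal
  ungroup() %>%
  pivot_wider(names_from = animal, values_from = hunger) %>%
  select(-id)

res$dog
# [1]  0  1  0 NA
res$cat
# [1] 1 1 1 0
```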

answered Mar 17 '23 at 20:18 by Quinten

This is exactly how I would have done it. I would have loved to use this one as well: `df %>% pivot_wider(names_from = animal, values_from = hunger, values_fill = 0)`, but it gives the error `Error in `pivot_wider()`: ! Can't convert `fill` to .` – TarJae Mar 17 '23 at 21:04
Hi @TarJae, I tried that also at first but unfortunately that doesn’t work. – Quinten Mar 17 '23 at 21:25

score 7 · Answer 6 · answered Mar 17 '23 at 20:19

A tidy framework way

library(dplyr)
library(tidyr)

df |> 
  pivot_wider(names_from = animal, values_from = hunger, values_fn = list) |> 
  unnest(cols = c("dog", "cat"))

Base R

do.call(cbind.data.frame, tapply(df$hunger, df$animal, `+`))

answered Mar 17 '23 at 20:19 by Just James

score 7 · Answer 7 · answered Mar 17 '23 at 20:26

Throwing a tidyverse/purrr solution into the mix:

library(tidyverse)

df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1,1))

df %>% 
  group_split(animal) %>% 
  map(~tibble(!!quo_name(unique(.x$animal)) := .x$hunger)) %>% 
  list_cbind()
  
#> # A tibble: 3 × 2
#>     cat   dog
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1     1
#> 3     1     0
