df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1,1))

I have a dataframe like the one above with two columns, one containing categories and the other containing binary data.

I want to reshape the data frame so that each value of the category column ("animal") becomes its own column, with the corresponding values of the other column ("hunger") as the cell values, i.e.

Desired output:

df <- data.frame(dog = c(0, 1, 0),
                 cat = c(1, 1, 1))

How can I achieve this?

TylerH
Icewaffle
  • What would you do if `df` was 7 rows and `dog` and `cat` weren't equal length? (Also, as an aside, I don't think this is a very good plan because the data structure is not very robust.) – Ian Campbell Mar 17 '23 at 20:38

7 Answers


In the case of uneven length among different categories, we can use

list2DF(
  lapply(
    . <- unstack(df, hunger ~ animal),
    `length<-`,
    max(lengths(.))
  )
)

or

list2DF(
  lapply(
    . <- unstack(rev(df)),
    `length<-`,
    max(lengths(.))
  )
)

and we will obtain

  cat dog
1   1   0
2   1   1
3   1   0
4   0  NA

Dummy data used above (seven rows, so the groups have uneven lengths)

df <- data.frame(
  animal = c("dog", "dog", "cat", "dog", "cat", "cat", "cat"),
  hunger = c(0, 1, 1, 0, 1, 1, 0)
)

For the original six-row data, where both groups have the same length, unstack alone is enough, e.g.,

> unstack(rev(df))
  cat dog
1   1   0
2   1   1
3   1   0

or

> unstack(df, hunger ~ animal)
  cat dog
1   1   0
2   1   1
3   1   0
TylerH
ThomasIsCoding
  • I think 2nd version, without rev, should be the one at the top. – zx8754 Mar 17 '23 at 21:29
  • Great solution, how would you go about turning it into a dataframe if they are of uneven length as Ian suggests? I.e. df <- data.frame(animal = c("dog", "cat", "dog", "cat", "cat"), hunger = c(1, 1, 0, 1,1)) – Icewaffle Mar 17 '23 at 22:17
  • @Icewaffle what's the desired output in that case, i.e., uneven length? – ThomasIsCoding Mar 17 '23 at 22:19
  • Desired output would be even length with NA filling in bottom rows of smaller column – Icewaffle Mar 18 '23 at 09:32

Base R:

df$id <- ave(df$hunger, df$animal, FUN = seq_along)
reshape(df, idvar = "id", timevar = "animal", direction = "wide")[, -1]

  hunger.dog hunger.cat
1          0          1
2          1          1
4          0          1
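
The result above keeps the hunger. prefix that reshape adds and the original row numbers (1, 2, 4). If plain dog/cat columns are wanted, one possible cleanup is sketched below, assuming the value column is named hunger as in the question; the name wide is only illustrative.

```r
# Data from the question
df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1, 1))

# Running index within each animal, then reshape to wide and drop the id
df$id <- ave(df$hunger, df$animal, FUN = seq_along)
wide <- reshape(df, idvar = "id", timevar = "animal", direction = "wide")[, -1]

# Strip the "hunger." prefix and reset the row names
names(wide) <- sub("^hunger\\.", "", names(wide))
rownames(wide) <- NULL
wide
#   dog cat
# 1   0   1
# 2   1   1
# 3   0   1
```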
akrun
TarJae

Using split:

data.frame(split(df$hunger, df$animal))
#   cat dog
# 1   1   0
# 2   1   1
# 3   1   0
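
Note that data.frame() only accepts the split result when every group has the same length. A possible extension for uneven groups, sketched on hypothetical seven-row data (df7, parts, and padded are illustrative names), pads the shorter groups by indexing past their end, which yields NA:

```r
# Hypothetical seven-row data: four cats, three dogs
df7 <- data.frame(
  animal = c("dog", "dog", "cat", "dog", "cat", "cat", "cat"),
  hunger = c(0, 1, 1, 0, 1, 1, 0)
)

parts <- split(df7$hunger, df7$animal)

# data.frame() refuses columns of different lengths
failed <- inherits(try(data.frame(parts), silent = TRUE), "try-error")

# Indexing beyond the end of a vector returns NA, so this pads each group
n <- max(lengths(parts))
padded <- data.frame(lapply(parts, `[`, seq_len(n)))
padded
#   cat dog
# 1   1   0
# 2   1   1
# 3   1   0
# 4   0  NA
```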
zx8754

Using data.table

library(data.table)
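# value.var is named explicitly so dcast does not have to guess the value column
dcast(setDT(df), rowid(animal) ~ animal, value.var = "hunger")[, animal := NULL][]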

-output

    cat dog
1:   1   0
2:   1   1
3:   1   0
akrun

You could use pivot_wider, first creating a row id within each group so that the duplicates are uniquely identified, and then passing names_from and values_from like this:

library(dplyr)
library(tidyr)
df %>%
  group_by(animal) %>%
  mutate(id = row_number()) %>%
  pivot_wider(names_from = animal, values_from = hunger) %>%
  select(-id)
#> # A tibble: 3 × 2
#>     dog   cat
#>   <dbl> <dbl>
#> 1     0     1
#> 2     1     1
#> 3     0     1

Created on 2023-03-17 with reprex v2.0.2

Quinten
  • This is exactly how I would have done it. I would also have loved to use `df %>% pivot_wider(names_from = animal, values_from = hunger, values_fill = 0)`, but it gives the error `Error in pivot_wider(): ! Can't convert fill <double> to <list>.` – TarJae Mar 17 '23 at 21:04
  • Hi @TarJae, I tried that at first too, but unfortunately it doesn't work. – Quinten Mar 17 '23 at 21:25

A tidy framework way

library(dplyr)
library(tidyr)

df |> 
  pivot_wider(names_from = animal, values_from = hunger, values_fn = list) |> 
  unnest(cols = c("dog", "cat"))

Base R

do.call(cbind.data.frame, tapply(df$hunger, df$animal, `+`))
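
The base R one-liner relies on `+` being called with a single argument: that is unary plus, which returns a numeric vector unchanged, so tapply simply collects each group's values. A minimal sketch of the same idea with identity as a more explicit stand-in follows; the names groups and wide are illustrative, and equal group sizes are assumed.

```r
# Data from the question
df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1, 1))

# Unary plus leaves a numeric vector as it is
unchanged <- identical(+c(0, 1, 0), c(0, 1, 0))

# One vector per animal; c() drops the array dimension and keeps the group names
groups <- c(tapply(df$hunger, df$animal, identity))

# Bind the per-group vectors side by side as columns
wide <- do.call(cbind.data.frame, groups)
wide
#   cat dog
# 1   1   0
# 2   1   1
# 3   1   0
```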
Just James

Throwing a tidyverse/purrr solution into the mix:

library(tidyverse)

df <- data.frame(animal = c("dog", "dog", "cat", "dog", "cat", "cat"),
                 hunger = c(0, 1, 1, 0, 1,1))

df %>% 
  group_split(animal) %>% 
  map(~tibble(!!quo_name(unique(.x$animal)) := .x$hunger)) %>% 
  list_cbind()
  
#> # A tibble: 3 × 2
#>     cat   dog
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1     1
#> 3     1     0
Matt