3

I would like to know a practical way to transform dat in the table below

dat <- data.frame('city' = c('A','A','B','C','A','B','B','C','C','C'), 
                  'color' = c('red', 'green', 'blue', 'red', 'green', 'blue', 'green', 'blue', 'red', 'red'),
                  'sex' = c('M','F','F','M','F','F','F','M','F','M'))


city red green blue F M
A    1   2     0    2 1
B    0   1     2    3 0
C    3   0     1    1 3
Darren Tsai
  • 32,117
  • 5
  • 21
  • 51

3 Answers3

3

With tidyr, you can pivot_longer and then pivot_wider:

library(tidyr)

dat %>%
  pivot_longer(c(color, sex)) %>%
  pivot_wider(id_cols = city, names_from = c(name, value), names_sort = TRUE,
              values_from = value, values_fn = length, values_fill = 0)

# # A tibble: 3 × 6
#   city  color_blue color_green color_red sex_F sex_M
#   <chr>      <int>       <int>     <int> <int> <int>
# 1 A              0           2         1     2     1
# 2 B              2           1         0     3     0
# 3 C              1           0         3     1     3
Darren Tsai
  • 32,117
  • 5
  • 21
  • 51
2

With sapply, create multiple tables and bind them:

sapply(dat[2:3], \(x) as.data.frame.matrix(table(dat$city, x))) |>
  do.call(what = 'cbind.data.frame')

  color.blue color.green color.red sex.F sex.M
A          0           2         1     2     1
B          2           1         0     3     0
C          1           0         3     1     3
Maël
  • 45,206
  • 3
  • 29
  • 67
  • this solution avoids a problem that came up to me, but it was not present in the example I provided, when there is the same category in different variables this code treats them as different information since there is a combination of the category name with the variable name, thanks – Marcelo Rodrigues Feb 01 '23 at 14:52
  • I just need to add the city column to the final result – Marcelo Rodrigues Feb 01 '23 at 14:53
1

You could first convert to longer format and then count values per group and convert back to wider format using pivot_wider like this:

library(dplyr)
library(tidyr)
dat %>%
  pivot_longer(cols = c(color, sex)) %>%
  group_by(city) %>%
  add_count(value) %>%
  distinct() %>%
  select(-name) %>%
  pivot_wider(names_from = value, values_from = n, values_fill = 0)
#> # A tibble: 3 × 6
#> # Groups:   city [3]
#>   city    red     M green     F  blue
#>   <chr> <int> <int> <int> <int> <int>
#> 1 A         1     1     2     2     0
#> 2 B         0     0     1     3     2
#> 3 C         3     3     0     1     1

Created on 2023-01-31 with reprex v2.0.2

Quinten
  • 35,235
  • 5
  • 20
  • 53