6

I have a dataset like this:

data <- read.csv(text = "foo,bar
a,b
a,a
b,c
c,a
c,b")

I want to compute a table that tells me in how many rows every possible value appears, so something like this:

Value Count
a 3
b 3
c 3

I've tried grouping by the two columns using dplyr and then summarise, but that does not give you a count per value, but per column value. Any idea?

ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
jjmerelo
  • 22,578
  • 8
  • 40
  • 86

5 Answers5

6

With dplyr, tibble and tidyr, you could do:

data %>%
 mutate(across(everything(), trimws)) %>%
 rowid_to_column() %>%
 pivot_longer(-rowid) %>%
 group_by(value) %>%
 summarise(n = n_distinct(rowid))

  value     n
  <chr> <int>
1 a         3
2 b         3
3 c         3
jjmerelo
  • 22,578
  • 8
  • 40
  • 86
tmfmnk
  • 38,881
  • 4
  • 47
  • 67
5

Get unique per row, unlist the result and use table to get the counts.

table(unlist(apply(data, 1, unique)))
#a b c 
#3 3 3 

Or the same but using pipes.

apply(data, 1, unique) |> unlist() |> table()
#a b c 
#3 3 3 
GKi
  • 37,245
  • 2
  • 26
  • 48
5
read.csv(text="foo,bar
 a,b
 a,a
 b,c
 c,a
 c,b", strip.white = TRUE) |>
  mutate(id = row_number()) |>
  pivot_longer(cols = -id) |>
  distinct(id, value) |>
  count(value)
# # A tibble: 3 × 2
#   value     n
#   <chr> <int>
# 1 a         3
# 2 b         3
# 3 c         3
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
2

Another option:

library(dplyr)
library(tidyr)

data %>% 
  mutate(bar1 = ifelse(foo==bar, NA_character_, bar)) %>% 
  pivot_longer(-bar, values_drop_na = TRUE) %>% 
  count(value)

  value     n
  <chr> <int>
1 a         3
2 b         3
3 c         3
TarJae
  • 72,363
  • 6
  • 19
  • 66
2

You can use table + rowSums

> rowSums(table(unlist(data), c(row(data))) > 0)
a b c
3 3 3

Data

> dput(data)
structure(list(foo = c("a", "a", "b", "c", "c"), bar = c("b", 
"a", "c", "a", "b")), class = "data.frame", row.names = c(NA,
-5L))
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81