Count number of rows where a value appears in any of two columns in R

Question

I have a dataset like this:

data <- read.csv(text = "foo,bar
a,b
a,a
b,c
c,a
c,b")

I want to compute a table that tells me in how many rows every possible value appears, so something like this:

Value	Count
a	3
b	3
c	3

I've tried grouping by the two columns with dplyr and then calling summarise, but that gives a count per combination of column values rather than a count per value. Any ideas?

Is the white space in the left column intentional? – GKi, Apr 20 '23 at 13:20
@GKi, no, it was not. Thankfully, it was edited away. – jjmerelo, Apr 21 '23 at 11:41

score 6 · Accepted Answer · edited Apr 20 '23 at 16:06

With dplyr, tibble and tidyr, you could do:

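library(dplyr)
library(tidyr)
library(tibble)

data %>%
 mutate(across(everything(), trimws)) %>%  # strip stray whitespace from every column
 rowid_to_column() %>%                     # remember which row each cell came from
 pivot_longer(-rowid) %>%                  # one line per (row, column) cell
 group_by(value) %>%
 summarise(n = n_distinct(rowid))          # number of distinct rows per value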

  value     n
  <chr> <int>
1 a         3
2 b         3
3 c         3

answered Apr 20 '23 at 13:08 by tmfmnk · edited Apr 20 '23 at 16:06 by jjmerelo

score 5 · Answer 2 · edited Apr 20 '23 at 13:55

Take the unique values of each row, unlist the result, and use table to get the counts.

table(unlist(apply(data, 1, unique)))
#a b c 
#3 3 3

Or the same but using pipes.

apply(data, 1, unique) |> unlist() |> table()
#a b c 
#3 3 3
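
To see why the per-row unique step matters, here is a minimal base R sketch that compares it with a plain tabulation of all cells. It assumes the data frame from the question (rebuilt here so the snippet runs on its own); the names naive and per_row are only illustrative.

```r
# Rebuild the question's data so the snippet is self-contained
data <- data.frame(foo = c("a", "a", "b", "c", "c"),
                   bar = c("b", "a", "c", "a", "b"))

# Plain tabulation counts every cell, so the row ("a", "a") adds 2 to "a"
naive <- table(unlist(data))

# Deduplicating within each row first means a row adds at most 1 per value
per_row <- table(unlist(apply(data, 1, unique)))

print(naive)
print(per_row)
```

With this data the two results differ only for "a", because the second row is the only one that repeats a value.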

answered Apr 20 '23 at 13:03 by GKi · edited Apr 20 '23 at 13:55

score 5 · Answer 3 · answered Apr 20 '23 at 13:09

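library(dplyr)
library(tidyr)

# strip.white = TRUE removes the leading spaces in the pasted data
read.csv(text="foo,bar
 a,b
 a,a
 b,c
 c,a
 c,b", strip.white = TRUE) |>
  mutate(id = row_number()) |>   # row identifier
  pivot_longer(cols = -id) |>    # one line per (row, column) cell
  distinct(id, value) |>         # keep each value once per row
  count(value)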
# # A tibble: 3 × 2
#   value     n
#   <chr> <int>
# 1 a         3
# 2 b         3
# 3 c         3

score 2 · Answer 4 · answered Apr 20 '23 at 13:21

Another option:

library(dplyr)
library(tidyr)

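data %>%
  # blank out bar where it repeats foo, so a row is not counted twice for one value
  mutate(bar1 = ifelse(foo == bar, NA_character_, bar)) %>%
  # stack foo and bar1 (the original bar is left out) and drop the NAs
  pivot_longer(-bar, values_drop_na = TRUE) %>%
  count(value)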

  value     n
  <chr> <int>
1 a         3
2 b         3
3 c         3

answered Apr 20 '23 at 13:21 by TarJae

score 2 · Answer 5 · answered Apr 20 '23 at 13:51

You can use table + rowSums:

> rowSums(table(unlist(data), c(row(data))) > 0)
a b c
3 3 3
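
The one-liner above can be unpacked into steps. This is only a sketch, assuming the same data (rebuilt here so it runs on its own); the intermediate names values, row_ids, tab and counts are illustrative and not part of the original answer.

```r
# Rebuild the question's data so the snippet is self-contained
data <- data.frame(foo = c("a", "a", "b", "c", "c"),
                   bar = c("b", "a", "c", "a", "b"))

values  <- unlist(data)   # all cells, column by column
row_ids <- c(row(data))   # row index of each cell, in the same order

# Cross-tabulate value against row: each entry is how often a value occurs in that row
tab <- table(values, row_ids)

# Turn occurrences into presence/absence, then count the rows per value
counts <- rowSums(tab > 0)

print(tab)
print(counts)
```

The comparison tab > 0 is what keeps a row that repeats a value, such as ("a", "a"), from being counted twice.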

Data

> dput(data)
structure(list(foo = c("a", "a", "b", "c", "c"), bar = c("b", 
"a", "c", "a", "b")), class = "data.frame", row.names = c(NA,
-5L))

answered Apr 20 '23 at 13:51 by ThomasIsCoding
