R: Count frequency of values in nested list with sub-elements

Question

I have a nested list that contains country names. I want to count the frequency of the countries, where each sub-list that mentions a country adds 1 to that country's count (regardless of how often the country is mentioned in that sub-list).

For instance, if I have this list:

[[1]]
[1] "Austria" "Austria" "Austria"

[[2]]
[1] "Austria" "Sweden"

[[3]]
[1] "Austria" "Austria" "Sweden"  "Sweden" "Sweden" "Sweden"

[[4]]
[1] "Austria" "Austria" "Austria"

[[5]]
[1] "Austria" "Japan"

... then I would like the result to be like this:

country        freq
====================
Austria         5
Sweden          2
Japan           1

I have tried various approaches with lapply, unlist, table, etc., but none of them worked the way I need. I would appreciate your help!

s_baldur · Accepted Answer · 2020-10-27T09:28:24.677

One way with lapply(), unlist() and table():

count <- table(unlist(lapply(lst, unique)))
count
# Austria   Japan  Sweden 
#       5       1       2 


as.data.frame(count)
#      Var1 Freq
# 1 Austria    5
# 2   Japan    1
# 3  Sweden    2

Reproducible data (please provide it yourself next time):

lst <- list(
  c('Austria', 'Austria', 'Austria'), 
  c("Austria", "Sweden"), 
  c("Austria", "Austria", "Sweden", "Sweden", "Sweden", "Sweden"), 
  c("Austria", "Austria", "Austria"), 
  c("Austria", "Japan")
)
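To see why the one-liner works, it can be unrolled into its steps. The following is only a sketch: the intermediate names (per_sublist, flat, res) are made up for illustration, and the sorting and the country/freq column names are added here to match the layout asked for in the question.

```r
# Same data as above
lst <- list(
  c("Austria", "Austria", "Austria"),
  c("Austria", "Sweden"),
  c("Austria", "Austria", "Sweden", "Sweden", "Sweden", "Sweden"),
  c("Austria", "Austria", "Austria"),
  c("Austria", "Japan")
)

# Step 1: drop repeated mentions within each sub-list
per_sublist <- lapply(lst, unique)

# Step 2: flatten; each country now appears once per sub-list that mentions it
flat <- unlist(per_sublist)

# Step 3: tabulate and sort by decreasing frequency
count <- sort(table(flat), decreasing = TRUE)

# Step 4: build a data frame with the column names used in the question
res <- data.frame(
  country = names(count),
  freq = as.vector(count),
  stringsAsFactors = FALSE
)
res
#   country freq
# 1 Austria    5
# 2  Sweden    2
# 3   Japan    1
```

The key step is the unique() inside lapply(): without it, table() would count every mention rather than every sub-list.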

score 3 · Answer 2 · answered Oct 27 '20 at 09:41

Here is another base R option:

colSums(
  do.call(
    rbind,
    lapply(
      lst,
      function(x) table(factor(x, levels = unique(unlist(lst)))) > 0
    )
  )
)

which gives

Austria  Sweden   Japan
      5       2       1
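The same presence/absence idea can also be written with %in% instead of table(factor(...)) > 0. This is a sketch of a variant, not the code from the answer; the names countries and present are made up for illustration.

```r
lst <- list(
  c("Austria", "Austria", "Austria"),
  c("Austria", "Sweden"),
  c("Austria", "Austria", "Sweden", "Sweden", "Sweden", "Sweden"),
  c("Austria", "Austria", "Austria"),
  c("Austria", "Japan")
)

# All countries, in order of first appearance
countries <- unique(unlist(lst))

# One row per sub-list, one column per country:
# TRUE if the country is mentioned at least once in that sub-list
present <- do.call(rbind, lapply(lst, function(x) countries %in% x))
colnames(present) <- countries

# Summing the logical columns counts the sub-lists that mention each country
colSums(present)
# Austria  Sweden   Japan
#       5       2       1
```

Keeping the logical matrix around can be handy if you later want to know which sub-lists mention a given country, e.g. which(present[, "Sweden"]).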

ThomasIsCoding

score 2 · Answer 3 · answered Oct 27 '20 at 08:47

One way is to convert the list to a data frame and, for each country, count the distinct list elements in which it occurs.

library(dplyr)

tibble::enframe(lst) %>%
  tidyr::unnest(value) %>%
  group_by(value) %>%
  summarise(freq = n_distinct(name))


#   value    freq
#   <chr>   <int>
# 1 Austria     5
# 2 Japan       1
# 3 Sweden      2

data

lst <- list(c('Austria', 'Austria', 'Austria'), c("Austria", "Sweden"), 
     c("Austria", "Austria", "Sweden",  "Sweden", "Sweden", "Sweden"), 
     c("Austria", "Austria", "Austria"), c("Austria", "Japan" ))

score 2 · Answer 4 · answered Oct 27 '20 at 23:19

Another option is to stack the list into a two-column data.frame, keep the unique rows, and then apply table() to the values column:

table(unique(stack(setNames(lst, seq_along(lst))))$values)

# Austria   Japan  Sweden 
#       5       1       2 
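It may help to look at the intermediate objects of this approach. A sketch with made-up names (long, pairs) for the two intermediate data frames:

```r
lst <- list(
  c("Austria", "Austria", "Austria"),
  c("Austria", "Sweden"),
  c("Austria", "Austria", "Sweden", "Sweden", "Sweden", "Sweden"),
  c("Austria", "Austria", "Austria"),
  c("Austria", "Japan")
)

# stack() needs a named list, so the sub-lists are named "1", "2", ...
# The result has one row per mention: column `values` holds the country,
# column `ind` holds the name of the sub-list it came from
long <- stack(setNames(lst, seq_along(lst)))
nrow(long)
# [1] 16

# unique() keeps one row per (country, sub-list) pair
pairs <- unique(long)
nrow(pairs)
# [1] 8

# Counting the remaining rows per country gives the number of sub-lists
table(pairs$values)
```

Since unique() works on whole rows here, a country is only de-duplicated within the same sub-list, which is exactly the counting rule from the question.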

akrun
