Using R to count character occurrences in multiple columns of a data.frame

Question

I'm new to R and have a data.frame with 100 columns. Each column is character data and I am trying to make a summary of how many times a character shows up in each column. I would like to be able to make a summary of all the columns at once without having to type in code for each column. I've tried

occurrences <- table(unlist(my_df))

but this table gives me a summary of all columns combined (not a summary for each column).

When I make a summary for one column my output looks how I want but only for that one column:

BG_occurrences <- table(unlist(my_df$BG))

   1   na SOME 
  17   20    1

Is there a way to code and get a summary of all data in each column all at once? I want the output to look something like this:

     1   na   SOME
BG:   17   20   1
sBG:  23   10   5
BX:   18   20   0
NG:   21   11   6

score 0 · Accepted Answer · answered Jan 28 '21 at 02:30

We can use lapply/sapply to loop over the columns and apply table to each one. With lapply the result is a list holding one table per column.

lapply(my_df, table)
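
If a single matrix like the one in the question is wanted, one option is to tabulate every column over the same set of values, so that sapply can bind the results together. This is only a minimal sketch: the small my_df below is made-up stand-in data, and lvls and counts are names chosen for illustration.

```r
# Hypothetical stand-in for my_df (values are made up for illustration)
my_df <- data.frame(
  BG  = c("1", "na", "1", "SOME"),
  sBG = c("na", "na", "1", "1"),
  BX  = c("1", "1", "1", "na"),
  stringsAsFactors = FALSE
)

# Every distinct value found anywhere in the data frame
lvls <- sort(unique(unlist(my_df)))

# Tabulate each column over the shared levels, so values that are
# absent from a column are counted as 0 instead of being dropped.
# sapply() returns values in rows, so t() puts one data column per row.
counts <- t(sapply(my_df, function(x) table(factor(x, levels = lvls))))
counts
```

Without the shared levels, columns that contain different sets of values give tables of different lengths, and sapply falls back to returning a list.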

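Or it can be done in a vectorized way. col(my_df) gives the column index of every cell, so cross-tabulating it against the values counts each value per column (the rows of the result are labelled by column position)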

table(c(col(my_df)), unlist(my_df))
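
To label the rows with column names rather than positions, the indices can be mapped back through names(my_df). The sketch below assumes a small made-up my_df that contains one real NA. The useNA = "ifany" argument is an addition not used in the answer; by default table() silently drops real NA values (the string "na" is counted like any other value).

```r
# Hypothetical stand-in for my_df; note the real NA in BG
my_df <- data.frame(
  BG  = c("1", "na", "1", NA),
  sBG = c("na", "SOME", "1", "1"),
  stringsAsFactors = FALSE
)

# Column name of every cell, in the same column-major order as unlist().
# Fixing the factor levels keeps the rows in the data frame's column order.
col_names <- factor(names(my_df)[c(col(my_df))], levels = names(my_df))

# Cross-tabulate column name against value; useNA = "ifany" adds a
# column for real NA values when any are present
by_name <- table(col_names, unlist(my_df), useNA = "ifany")
by_name
```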

Or with the tidyverse, which returns the counts in long format (one row per column/value pair)

library(dplyr)
library(tidyr)
my_df %>%
   pivot_longer(cols = everything()) %>%
   count(name, value)

answered Jan 28 '21 at 02:30

akrun

Thank you for the help! Is it possible to select certain columns from my data frame to make a table in the vectorized format? I tried `mini_df <- table(c(col(my_df$BG:my_df$BX)), unlist(my_df$BG:my_df$BX))` but this did not work, and I get an error saying 'Error in my_df$BG:my_df$BX : NA/NaN argument'. I think this is because some of the data in my_df is NA. Is there a way to get around this? – clions226 Jan 31 '21 at 20:55
@clions226 You can use `subset` with `select`, i.e. `subset(my_df, select = BG:BX)` – akrun Feb 01 '21 at 11:42
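
The error in the comment above most likely comes from the `:` operator itself rather than from missing data: `my_df$BG:my_df$BX` tries to build a numeric sequence from the column values. Inside the `select` argument of `subset`, column names stand for column positions, so `BG:BX` means "every column from BG through BX". A minimal sketch, assuming a made-up my_df in which BG, sBG and BX are adjacent columns:

```r
# Hypothetical stand-in for my_df; ID and NG are extra columns to be excluded
my_df <- data.frame(
  ID  = c("a", "b", "c"),
  BG  = c("1", "na", "1"),
  sBG = c("na", "na", "SOME"),
  BX  = c("1", "1", "na"),
  NG  = c("1", "na", "na"),
  stringsAsFactors = FALSE
)

# Keep only the columns from BG through BX (by position in the data frame)
mini_df <- subset(my_df, select = BG:BX)

# Same vectorized tabulation as in the answer, applied to the subset;
# rows are labelled by column position within mini_df
mini_counts <- table(c(col(mini_df)), unlist(mini_df))
mini_counts
```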
