1

Assume that i have two variables. See Dummy data below:

Out of 250 records:

SEX

Male : 100

Female : 150

HAIR

Short : 110

Long : 140

The code i currently use is provided below, For each variable a different table is created:

sexTable <- table(myDataSet$Sex)
hairTable <- table(myDataSet$Hair)

View(sexTable):
|------------------|------------------|
|       Level      |    Frequency     |
|------------------|------------------|
|        Male      |       100        |
|       Female     |       150        |
|------------------|------------------|


View(hairTable)
|------------------|------------------|
|       Level      |    Frequency     |
|------------------|------------------|
|        Short     |       110        |
|        Long      |       140        |
|------------------|------------------|

My question is how to merge the two tables in R that will have the following format As well as to calculate the percentage of frequency for each group of levels:

|---------------------|------------------|------------------|
|      Variables      |       Level      |    Frequency     |
|---------------------|------------------|------------------|
|      Sex(N=250)     |        Male      |       100 (40%)  |
|                     |       Female     |       150 (60%)  |
|      Hair(N=250)    |        Short     |       110 (44%)  |
|                     |        Long      |       140 (56%)  |
|---------------------|------------------|------------------|
Loizos Vasileiou
  • 674
  • 10
  • 37

1 Answers1

1

We can use bind_rows after converting to data.frame

library(dplyr)
bind_rows(list(sex = as.data.frame(sexTable),
       Hair = as.data.frame(hairTable)), .id = 'Variables')

Using a reproducible example

tbl1 <- table(mtcars$cyl)
tbl2 <- table(mtcars$vs)
bind_rows(list(sex = as.data.frame(tbl1), 
       Hair = as.data.frame(tbl2)), .id = 'Variables')%>% 
   mutate(Variables = replace(Variables, duplicated(Variables), ""))

If we also need the percentages

dat1 <- transform(as.data.frame(tbl1), 
  Freq = sprintf('%d (%0.2f%%)', Freq,   as.numeric(prop.table(tbl1) * 100)))
dat2 <- transform(as.data.frame(tbl2), 
  Freq = sprintf('%d (%0.2f%%)', Freq,   as.numeric(prop.table(tbl2) * 100)))
bind_rows(list(sex = dat1, Hair = dat2, .id = 'Variables')
akrun
  • 874,273
  • 37
  • 540
  • 662
  • that's a great suggestion. Thanks the values are returned as expected although after the results are printed the following error occurs: `Warning messages: 1: In bind_rows_(x, .id) : Unequal factor levels: coercing to character 2: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector 3: In bind_rows_(x, .id) : binding character and factor vector, coercing into character vector`. What it means? Also the Variable should not be repeated when the levels have the same variables. (eg. Male and Female are both in Sex variable as shown in my question) – Loizos Vasileiou Mar 15 '20 at 21:27
  • 1
    @LoizosVasileiou that is just a friendly warning. It happens because one of the dataset have different column type (factor), so it is converted to `character` class. – akrun Mar 15 '20 at 21:29
  • sorry for asking again, what should I search for if I need to also provide the N (N of Sex = 250) near the Variable name. Question has been updated. – Loizos Vasileiou Mar 15 '20 at 22:17
  • 1
    @LoizosVasileiou for this purpose, I would not replace the values to blank you can do a filter easily if you keep the varriable as such `bind_rows(list(sex = as.data.frame(tbl1), Hair = as.data.frame(tbl2)), .id = 'Variables')%>% filter(Variables == 'sex', Freq == 11) %>% pull(Freq)# [1] 11` – akrun Mar 15 '20 at 22:19