I have the following data frame:

df <- data.frame(NR_HH = c('HH1','HH1','HH1','HH1','HH2','HH2'), ID = c(11,12,13,14,21,22), Age = c(28,25,16,4,45,70), Fem_Adult = c('FALSE','TRUE','FALSE','FALSE', 'TRUE','TRUE'),Male_Adult = c('TRUE','FALSE','FALSE','FALSE', 'FALSE','FALSE'), School_Child = c('FALSE','FALSE','TRUE','FALSE', 'FALSE','FALSE'), Preschool_Child = c('FALSE','FALSE','FALSE','TRUE', 'FALSE','FALSE'))

#  NR_HH ID Age Fem_Adult Male_Adult School_Child Preschool_Child
#1   HH1 11  28     FALSE       TRUE        FALSE           FALSE
#2   HH1 12  25      TRUE      FALSE        FALSE           FALSE
#3   HH1 13  16     FALSE      FALSE         TRUE           FALSE
#4   HH1 14   4     FALSE      FALSE        FALSE            TRUE
#5   HH2 21  45      TRUE      FALSE        FALSE           FALSE
#6   HH2 22  70      TRUE      FALSE        FALSE           FALSE

I want to group this data by NR_HH and build a new data frame that shows the number of female adults, male adults, school age children and preschool age children in each household. I want to get something like this:

#  NR_HH Fem_Adult Male_Adult School_Child Preschool_Child
#1   HH1         1          1            1               1
#2   HH2         2          0            0               0

I tried the following code:

library(dplyr)
df_summary <- df %>% group_by(NR_HH) %>% summarise_if(is.logical, sum)

But I get this error:

Error: Can't create call to non-callable object
  • Your dataset columns are `factor`. TRUE/FALSE should not be quoted – akrun Dec 10 '18 at 12:32
  • Thanks for your answer. You are right, I wrote it like this by mistake. When I remove the quotation marks it works fine. However, this was just a small example data set. With my original data set I keep getting the same error. My original data set has 16084 rows and 1260 variables, only 4 of which are logical; the rest are numeric, integer or factor. Do you have any idea why I might be getting this error? What is a non-callable object? I couldn't find any explanation of it online. – Elif Cansu Akoğuz Dec 10 '18 at 12:44
  • You can check the class of the columns to see whether they are logical: `sapply(yourdata, class)`. I would assume the logical columns are not of class logical. – akrun Dec 10 '18 at 12:46
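
The class check suggested in the last comment can be sketched as follows. This is a minimal illustration on a cut-down toy data frame, not the asker's real data: the three-row `df` and the helper names `col_classes` and `logical_cols` are made up for the example, and `stringsAsFactors = TRUE` is set explicitly to mimic the pre-4.0.0 default of `data.frame`.

```r
# Toy data mirroring the question: quoted 'TRUE'/'FALSE' values are strings, not logicals.
df <- data.frame(
  NR_HH = c("HH1", "HH1", "HH2"),
  Fem_Adult = c("FALSE", "TRUE", "TRUE"),
  Age = c(28, 25, 45),
  stringsAsFactors = TRUE  # assumption: reproduce the default used before R 4.0.0
)

# Class of every column; Fem_Adult is reported as "factor", not "logical".
col_classes <- sapply(df, class)
print(col_classes)

# Names of the columns that is.logical() would actually select (none in this toy data).
logical_cols <- names(df)[sapply(df, is.logical)]
print(logical_cols)
```

If `logical_cols` comes back empty or shorter than expected, `summarise_if(is.logical, sum)` has fewer columns to work on than intended, which is worth ruling out before looking for other causes of the error.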

1 Answer


The issue is with the column types. Quoting 'TRUE'/'FALSE' creates character values, and because the data.frame call uses stringsAsFactors = TRUE by default (in R versions before 4.0.0), these columns end up with class factor. This could have been avoided by simply leaving TRUE/FALSE unquoted. Assuming the input is already quoted, convert the columns to logical with as.logical and then take the sum after grouping by 'NR_HH':

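library(dplyr)

df %>%
   mutate_at(4:7, as.logical) %>%   # convert the quoted flag columns (positions 4 to 7) to logical
   group_by(NR_HH) %>%
   summarise_if(is.logical, sum)    # TRUE counts as 1, so sum() gives the count per household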
# A tibble: 2 x 5
#  NR_HH Fem_Adult Male_Adult School_Child Preschool_Child
#   <fct>     <int>      <int>        <int>           <int> 
# 1 HH1           1          1            1               1
# 2 HH2           2          0            0               0
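
For readers on dplyr 1.0.0 or later, where `mutate_at()` and `summarise_if()` are superseded by `across()`, the same idea can be sketched as below. This is an alternative under stated assumptions rather than part of the original answer: the helper vector `flag_cols` is a name introduced here, and the columns are selected by name instead of by position so the code does not depend on column order.

```r
library(dplyr)  # across() requires dplyr >= 1.0.0

# Same flag columns as in the question, with the values quoted by mistake.
df <- data.frame(
  NR_HH = c("HH1", "HH1", "HH1", "HH1", "HH2", "HH2"),
  Fem_Adult = c("FALSE", "TRUE", "FALSE", "FALSE", "TRUE", "TRUE"),
  Male_Adult = c("TRUE", "FALSE", "FALSE", "FALSE", "FALSE", "FALSE"),
  School_Child = c("FALSE", "FALSE", "TRUE", "FALSE", "FALSE", "FALSE"),
  Preschool_Child = c("FALSE", "FALSE", "FALSE", "TRUE", "FALSE", "FALSE")
)

# Hypothetical helper: the columns that should hold TRUE/FALSE flags.
flag_cols <- c("Fem_Adult", "Male_Adult", "School_Child", "Preschool_Child")

df_summary <- df %>%
  mutate(across(all_of(flag_cols), as.logical)) %>%  # works for character and factor input
  group_by(NR_HH) %>%
  summarise(across(all_of(flag_cols), sum))          # count the TRUE values per household

print(df_summary)
```

Selecting by name also sidesteps the `is.logical` predicate entirely, so the summary no longer depends on which columns happen to have class logical.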
akrun