I have spent a whole day looking for an answer and I'm close to giving up. I imagine this is a pretty simple situation, so I'd be glad of any help.

Let's say I have two datasets. The first one holds the IDs of all students:

library(tidyverse)
library(psych)

ds_of_students <- data.frame(id=(1:4), school=c("public","private"))

The second one has all the results of a test. Let's say each column is an ID.

ds_of_results <- structure(list(i1 = c(1, 2, 4, 4),
                                i2 = c(3, 3, 2, 2),
                                i3 = c(2, 3, 3, 5),
                                i4 = c(4, 1, 3, 2)), 
                           class = c("tbl_df", "tbl", 
                                     "data.frame"), row.names = c(NA, -4L))

Now I need to report a table of student IDs, grouped by school, together with their results (actually Cronbach's alpha, which is pretty common in psychology).

ds_of_students %>%
  group_by(school) %>%
  summarise(n=n(), 
            id = paste(id, collapse = ",")) %>% 
  mutate(item2=psych::alpha(ds_of_results[c(id)])$total[1])

I get this error:

Error in mutate_impl(.data, dots) : 
  Evaluation error: Columns `2,4`, `1,3` not found.

But when I run it the traditional way, it works:

psych::alpha(ds_of_results[c(1,3)])$total[1]

I've tried working with paste, noquote, gsub and strcol.

Please run this code to get reproducible results. Thanks very much!

library(tidyverse)
library(psych)
ds_of_students <- data.frame(id=(1:4), school=c("public","private"))
ds_of_results <- structure(list(i1 = c(1, 2, 4, 4),
                                i2 = c(3, 3, 2, 2),
                                i3 = c(2, 3, 3, 5),
                                i4 = c(4, 1, 3, 2)), 
                           class = c("tbl_df", "tbl", 
                                     "data.frame"), row.names = c(NA, -4L))

ds_of_students %>%
  group_by(school) %>%
  summarise(n=n(), 
            id = paste(id, collapse = ",")) %>% 
  mutate(item2=psych::alpha(ds_of_results[c(id)])$total[1])


alpha(ds_of_results[c(1,3)])$total[1]

My desired output is something like this:

[image: Desired output]

And just to give some context to my question, this is the real dataset, where I have to compute Cronbach's alpha for the items of each group:

[image: Real results]

prosoitos
Luis
  • Can you also provide your desired result for this example? – iod Nov 18 '18 at 01:57
  • when you paste, you're creating a character vector. You can't pass the string "2,4" as a subset call and expect to get the same result as passing the two integers. – iod Nov 18 '18 at 02:00
  • @iod Please, take a look at the new code. I've added an image and changed the code to make my question clear. Thanks. – Luis Nov 18 '18 at 02:04

2 Answers

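# Compute Cronbach's alpha for the items that belong to one school.
get_alpha <- function(x) {
  # ids of the students in school x, used as column positions in ds_of_results
  cols <- ds_of_students[ds_of_students$school == x, "id"]
  items <- ds_of_results[, cols]
  data.frame(
    school = x,
    id = paste0(names(items), collapse = ","),
    raw_alpha = psych::alpha(items)$total$raw_alpha
  )
}

# unique() instead of levels(): since R 4.0, data.frame() no longer turns
# strings into factors, so levels() would return NULL here.
map_df(sort(unique(as.character(ds_of_students$school))), get_alpha)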

Result

   school    id raw_alpha
1 private i2,i4      0.00
2  public i1,i3      0.85

There were several issues in your code:

  • mutate works on whole columns of a data frame, while psych::alpha needs an entire data frame of items per call. Inside mutate, id is the full vector for both schools at once, and each element is a pasted string such as "1,3" rather than the integers 1 and 3, which is why the error lists both `2,4` and `1,3`. So I don't think that you can get your alpha values with a plain mutate

  • you use $total[1] to extract one element of the list of data frames returned by psych::alpha, but that gives a one-column data frame rather than a number; $total$raw_alpha returns the value itself

So basically, psych::alpha, which needs entire data frames as input and outputs a list of data frames, does not play well with a classic dplyr wrangling workflow.
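
For completeness, here is a sketch of one way to keep the computation inside a pipeline anyway: store the ids in a list column and call psych::alpha() once per row with purrr::map_dbl(). It is only an illustration on the question's toy data, not something tested on the real dataset, and `res` is just a name chosen here.

```r
library(dplyr)
library(purrr)
library(psych)

# Toy data from the question (a plain data.frame is enough here).
ds_of_students <- data.frame(id = 1:4, school = c("public", "private"))
ds_of_results <- data.frame(i1 = c(1, 2, 4, 4), i2 = c(3, 3, 2, 2),
                            i3 = c(2, 3, 3, 5), i4 = c(4, 1, 3, 2))

res <- ds_of_students %>%
  group_by(school) %>%
  # Keep the ids as an integer vector per school instead of pasting them.
  summarise(n = n(), id = list(id)) %>%
  # map_dbl() calls psych::alpha() once per row, on that school's columns only.
  mutate(raw_alpha = map_dbl(id, ~ psych::alpha(ds_of_results[.x])$total$raw_alpha))

res
```

The raw_alpha column should match the 0.00 and 0.85 shown in the result above, since each call sees the same two columns that get_alpha selects.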

prosoitos
  • 6,679
  • 5
  • 27
  • 41

I'm not sure this is what you're looking for, but try this and tell me if you're getting the expected result. Replace your summarise call like this (also note the "unlist" in the mutate call):

ds_of_students %>%
    # as.character() lets this work whether id is an integer or a string
    mutate(id = lapply(strsplit(as.character(id), ","), as.integer)) %>%
    group_by(school) %>%
    summarise(id = list(id)) %>%
    mutate(item2 = psych::alpha(ds_of_results[unlist(id)])$total[1])

What I'm doing here is replacing your paste with a list, so that the numbers are retained as numbers, and can be passed to the subset call in the next step without a hitch. This will also work if id is a character, of course, assuming the column names in ds_of_results are the id's from ds_of_students. You need to pass it with unlist so that the subset gets it as a simple vector, rather than as a list with one vector element.
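
A small base-R check of the difference between the pasted string and real integers may help. It uses the toy results table from the question as a plain data.frame; the names `id_string`, `by_string` and `by_position` are only illustrative.

```r
# Toy results table from the question (plain data.frame for simplicity).
ds_of_results <- data.frame(i1 = c(1, 2, 4, 4), i2 = c(3, 3, 2, 2),
                            i3 = c(2, 3, 3, 5), i4 = c(4, 1, 3, 2))

id_string <- paste(c(2, 4), collapse = ",")   # one single string: "2,4"

# Indexing by the string looks for a column literally named "2,4" and fails.
by_string <- tryCatch(names(ds_of_results[id_string]),
                      error = function(e) "error")

# Splitting the string back into integers selects columns 2 and 4 by position.
id_int <- as.integer(strsplit(id_string, ",")[[1]])
by_position <- names(ds_of_results[id_int])

by_string     # "error"
by_position   # "i2" "i4"
```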

With your fake data, I get this output, with warnings:

Some items ( i2 i4 ) were negatively correlated with the total scale and 
probably should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option
# A tibble: 2 x 3
  school  id        item2       
  <fct>   <list>    <data.frame>
1 private <int [2]> -1          
2 public  <int [2]> -1          
Warning messages:
1: In cor.smooth(r) : Matrix was not positive definite, smoothing was done
2: In psych::alpha(ds_of_results[unlist(id)]) :
  Some items were negatively correlated with the total scale and probably 
should be reversed.  
To do this, run the function again with the 'check.keys=TRUE' option
3: In cor.smooth(R) : Matrix was not positive definite, smoothing was done
4: In cor.smooth(R) : Matrix was not positive definite, smoothing was done

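The warnings might just be a problem with the fake data itself, not the code. Note, however, that both rows get the same value: mutate is not row-wise, so unlist(id) pools the ids of both schools into a single psych::alpha call.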
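
To see where the identical -1 in both rows comes from, here is a base-R sketch. `cronbach_alpha` is a hypothetical helper written for this illustration (the textbook raw-alpha formula on the item covariance matrix), not a function from psych, and the list `id` mimics the list column produced by summarise.

```r
# Toy results table from the question (plain data.frame for simplicity).
ds_of_results <- data.frame(i1 = c(1, 2, 4, 4), i2 = c(3, 3, 2, 2),
                            i3 = c(2, 3, 3, 5), i4 = c(4, 1, 3, 2))

# Hypothetical helper: raw Cronbach's alpha from the item covariance matrix.
cronbach_alpha <- function(items) {
  k <- ncol(items)
  cov_mat <- cov(items)
  (k / (k - 1)) * (1 - sum(diag(cov_mat)) / sum(cov_mat))
}

# Stand-in for the list column: one integer vector of ids per school.
id <- list(private = c(2L, 4L), public = c(1L, 3L))

# unlist() flattens the whole column, so all four items go into one call.
pooled <- unlist(id, use.names = FALSE)
alpha_pooled <- cronbach_alpha(ds_of_results[pooled])

# Looping over the list elements gives one alpha per school instead.
alpha_by_school <- vapply(id, function(cols) cronbach_alpha(ds_of_results[cols]),
                          numeric(1))

alpha_pooled
alpha_by_school
```

The pooled value is the -1 that appears in both rows of the output above, while the per-school values agree with the 0.00 and 0.85 reported in the other answer.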

iod
  • Thanks, in my real dataset, I've got this result: `k %>% group_by(Fator) %>% summarise(Qtde=n(), item = paste(item, collapse = ","), item = list(item), Min=round(min(Carga),2), Max=round(max(Carga),2)) %>% mutate(item2=psych::alpha(items_a[unlist(item)])$total[1])` `Error in mutate_impl(.data, dots) : Evaluation error: Columns `9,15,16,17,18,19,20,21,22,23,24,25`, `1,12,13,30`, `31,32,33`, `2,3,4,5,6,7,10,11`, `36,37,38`, `28,39,40,41` not found.` – Luis Nov 18 '18 at 02:18
  • Are all these columns actually in `items_a`? Can you give me the result of `names(items_a)`? – iod Nov 18 '18 at 02:22
  • Sure, ` items_a %>% names() [1] "i1" "i2" "i3" "i4" "i5" "i6" "i7" "i8" "i9" "i10" "i11" "i12" "i13" "i14" "i15" "i16" "i17" "i18" "i19" "i20" [21] "i21" "i22" "i23" "i24" "i25" "i26" "i27" "i28" "i29" "i30" "i31" "i32" "i33" "i34" "i35" "i36" "i37" "i38" "i39" "i40" [41] "i41" ` I changed the fake code to make it easier to understand. Thanks for your help!! – Luis Nov 18 '18 at 02:25
  • I see the problem now - `item` in your actual data is not an integer but a string, so we again have the same problem. Insert this line before your `group_by(Fator)` call: `mutate(item=lapply(strsplit(item,","),as.integer))`. Also, remove the `item = paste(item, collapse = ",")` from your `summarise`. – iod Nov 18 '18 at 03:13
  • Thanks much for your support!! I'll try it and let you know the results!! =) =) – Luis Nov 18 '18 at 03:30