
I am trying to get expss use_labels to work with dplyr logic; see the example below.

The vignette states the following under use_labels: "By now variable labels support available only for expressions which will be evaluated inside data.frame." Is this the issue I'm having here?

#
##########################################
library(expss)
library(tidyverse)
data(mtcars)
mtcars = apply_labels(mtcars,
                      mpg = "Miles/(US) gallon",
                      cyl = "Number of cylinders",
                      disp = "Displacement (cu.in.)",
                      hp = "Gross horsepower",
                      drat = "Rear axle ratio",
                      wt = "Weight (1000 lbs)",
                      qsec = "1/4 mile time",
                      vs = "Engine",
                      vs = c("V-engine" = 0,
                             "Straight engine" = 1),
                      am = "Transmission",
                      am = c("Automatic" = 0,
                             "Manual"=1),
                      gear = "Number of forward gears",
                      carb = "Number of carburetors"
)

# table with caption from label - labels working
cro_cpct(mtcars$am, mtcars$vs) %>% set_caption(var_lab(mtcars$am))

## This works as expected - now to get this with expss use_labels.
mtcars %>%
 group_by(am) %>%
  summarise(
   freq = n()
)
#######
#am             freq
#<labelled>    <int>
# 1 0             19
# 2 1             13
########################

#### This doesn't work - i.e. not labelled
use_labels(mtcars %>%
  group_by(am) %>%
   summarise(
    freq = n()
))
## Error in substitute_symbols(expr, c(substitution_list, list(..data = quote(expss::vars(other))))) : 
  # argument "expr" is missing, with no default

If labels can't be used with dplyr logic, does anyone know of another package that supports labels with dplyr? Regards

MarkWebb

2 Answers

You could use the ..data parameter to access the data inside the expression, and values2labels (thanks to @Gregory Demin) to show the value labels.

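library(expss)
library(dplyr)  # provides group_by() and summarise()
# mtcars is the labelled data frame built in the question
use_labels(mtcars, ..data %>% 
                      group_by(am) %>% 
                      summarise(freq = n()) %>% values2labels)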
# A tibble: 2 x 2
#  Transmission  freq
#  <labelled>   <int>
#1 Automatic       19
#2 Manual          13
Ronak Shah
  • Thanks for this but there are NO labels in the tibble. The labels are like Automatic & Manual – MarkWebb Dec 06 '19 at 07:18
  • ahh..I see. Sorry I couldn't find a way to do that. I'll try to check later. – Ronak Shah Dec 06 '19 at 08:34
  • @MarkWebb `use_labels` is for variable labels. In `summarise`, value labels are preserved but are not shown in the result. To show them, use `values2labels` as the last action in your chain, just after the `summarise`. – Gregory Demin Dec 06 '19 at 08:38
  • @GregoryDemin Thanks, that is new to me. Would you mind if I update that part in my answer? Or do you want to post it as a separate answer? – Ronak Shah Dec 06 '19 at 08:49
  • @RonakShah It's better that you update your answer. There is no reason for two almost identical posts. – Gregory Demin Dec 06 '19 at 09:05
  • Thank you guys - much appreciated. – MarkWebb Dec 06 '19 at 13:27
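
The distinction drawn in the comments above, variable labels versus value labels, can be seen on a single column. This is a minimal sketch that assumes only expss is loaded; the object names `cars` and `am_text` are made up for the illustration and do not come from the thread.

```r
library(expss)

# Small labelled copy of mtcars; `cars` is a hypothetical name for this sketch
cars = apply_labels(mtcars[, c("mpg", "am")],
                    am = "Transmission",
                    am = c("Automatic" = 0, "Manual" = 1))

# Variable label: one string that describes the whole column
var_lab(cars$am)

# Value labels: a named vector that maps the stored codes to text
val_lab(cars$am)

# values2labels() replaces the stored codes with their value labels
am_text = values2labels(cars$am)
head(am_text)
```

The variable label is what `use_labels` substitutes for the column name, while the value labels only appear in the output once the codes have been converted with `values2labels`.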

Package foreign does keep the value labels. Haven does not, or at least not by default, and I can't figure out how to make it happen. You can confirm this by inspecting str() of the imported data: if you don't see the labels in there somewhere, they didn't get imported. Here I'm importing a Stata data set with value labels for union and female, both of which are coded 0/1. When the data is imported with package foreign (read.dta), the cro_cpct command uses the value labels but not the variable labels. When it is imported with package haven (read_dta), cro_cpct uses the variable labels but not the value labels.

    library(expss)
    library(tidyverse)
    library(foreign)
    wages <- read.dta("c:/users/pjargowsky/documents/course/qm2/wages.dta")
    cro_cpct(wages$female, wages$union) %>% set_caption(var_lab(wages$female))

    #
    # |              |              | wages$union |      |
    # |              |              |          no |  yes |
    # | ------------ | ------------ | ----------- | ---- |
    # | wages$female |         male |        50.5 | 70.8 |
    # |              |       female |        49.5 | 29.2 |
    # |              | #Total cases |       438.0 | 96.0 |

    library(haven)
    wages <- read_dta("c:/users/pjargowsky/documents/course/qm2/wages.dta")
    cro_cpct(wages$female, wages$union) %>% set_caption(var_lab(wages$female))

    # Gender
    # |        |              | Union Membership |      |
    # |        |              |                0 |    1 |
    # | ------ | ------------ | ---------------- | ---- |
    # | Gender |            0 |             50.5 | 70.8 |
    # |        |            1 |             49.5 | 29.2 |
    # |        | #Total cases |            438.0 | 96.0 |
Jargp
  • Could you try `wages = add_labelled_class(wages)` after the import with haven? Will it work? – Gregory Demin May 29 '20 at 11:47
  • Greg, that did not work for me. Is there an option on read_dta() that I need? – Jargp May 29 '20 at 21:57
  • Could you start with a clean session and load expss after the haven package? After that, try `wages = add_labelled_class(wages)` again. If it doesn't work, please provide the result of `dput(wages[1:10, c("female","union")])`. – Gregory Demin May 29 '20 at 22:20
  • Greg, I think I figured it out. The Stata data set that was my target is very old (2002). Who knows what version of Stata that was. I've used it in teaching for years. I re-saved it as a Stata 16 dataset. Now package foreign won't even read it, but package haven does, and the value labels are included and used. Thanks for your help. – Jargp May 29 '20 at 23:03
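
The `add_labelled_class` suggestion from the comments can be tried without the original file. This is a rough sketch, not the thread's own code: the column built with `haven::labelled()` is an assumed stand-in for what `read_dta()` returns when a file does carry value labels, and the toy data and the name `wages_toy` are hypothetical.

```r
library(haven)
library(expss)  # loaded after haven, as suggested in the comments

# Hypothetical stand-in for an imported column: haven stores the codes
# together with a "labels" attribute and a "label" attribute
wages_toy = tibble::tibble(
    female = haven::labelled(c(0, 1, 1, 0),
                             labels = c(male = 0, female = 1),
                             label = "Gender")
)

# Check whether value labels were imported at all
str(wages_toy$female)
attr(wages_toy$female, "labels")

# Convert haven's class to the "labelled" class that expss works with
wages_toy = add_labelled_class(wages_toy)
class(wages_toy$female)

# expss accessors for the value labels and the variable label
val_lab(wages_toy$female)
var_lab(wages_toy$female)
```

If `attr(x, "labels")` is `NULL` right after the import, the labels were never read from the file, which matches the old-format problem described in the last comment; in that case no class conversion will bring them back.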