1

I'm trying to run a set of frequency tables in R without having to write the code for every single variable. For example, using the mtcars data in SPSS I would do something like:

FREQUENCIES mpg TO vs 

And it would give me the 8 frequency tables for the variables between mpg and vs. I'm trying to get this effect in R using the summarytools function freq or the sjPlot function view_df. I can do it using freq but you have to list the names of all of the variables instead of using a command like TO. And I can do it using view_df but you have to know the column positions of the variables (I have thousands of variables so that's not going to work). Please take a look at what I've got below.

#####USING FREQ IN SUMMARY TOOLS
library(summarytools)

freq(mtcars[ ,c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs")])  #works fine, but I don't want to have to list the names of all of the variables 

#####USING VIEW_DF IN SJPLOT
library(sjPlot)
view_df(mtcars[, c(1:8)],     #I want to be able to say c(mpg:vs)
        show.na = TRUE, 
        show.type = TRUE, 
        show.frq = TRUE, 
        show.prc = TRUE, 
        show.string.values = TRUE, 
        show.id = TRUE)

####A FEW EXTRA STEPS USING THE EXPSS PACKAGE

I know you can use the %to% in the expss package. I've got my own data and variable names here, sorry!

# table with counts
counts = calculate(olbm_na_A, cro(mdset(S06_01_NA %to% S06_99_NA), list("Count")))

# table with percents
percents = calculate(olbm_na_A, cro_cpct(mdset(S06_01_NA %to% S06_99_NA), list("Column, %")))

# combine tables
expss_output_viewer() 
(counts %merge% percents)

I expect it to print out a sequence of frequency tables. I want to be able to use some command that basically means var1 to var10. I can't figure out how to do this TO command. I expect it varies by what package you're using.

Gregory Demin
Carley
  • lapply(df[, column_selection], table, useNA="ifany") – lefft May 03 '19 at 17:12
  • The `dplyr` functions use indexing such as `mpg:vs`. You can use that in a function like `summarize_at` to carry out the same summary function(s) on all the columns from `mpg` to `vs` – camille May 03 '19 at 17:38
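The two comments above can also be combined in base R alone. A minimal sketch, assuming the variables of interest sit in adjacent columns: subset() accepts a range of column names such as select = mpg:vs, and lapply() with table() then builds one frequency table per column. The names selected and freq_tables are only illustrative.

```r
# Base R only: pick the columns from mpg through vs by name,
# then build one frequency table per column.
data(mtcars)

selected <- subset(mtcars, select = mpg:vs)

# useNA = "ifany" adds an NA category only when a column has missing values
freq_tables <- lapply(selected, table, useNA = "ifany")

names(freq_tables)   # the columns that were tabulated
freq_tables$cyl      # counts for each distinct value of cyl
```

Because subset() resolves mpg:vs against the column names, this does not require knowing the column positions.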

4 Answers

0

I think the easiest way to do this is to use grep and colnames to return the column index of the variables by name.

grep("mpg", colnames(mtcars)) : grep("vs", colnames(mtcars)) 

gets turned into c(1:8) by first finding the position of "mpg" in the column names of mtcars (which is 1) and then the position of "vs" (which is 8). You can then use your view_df or freq solutions as shown below, or there are many other ways to apply this.

freq(mtcars[, grep("mpg", colnames(mtcars)) : grep("vs", colnames(mtcars))]) 

view_df(mtcars[, grep("mpg", colnames(mtcars)) : grep("vs", colnames(mtcars))],     # columns mpg through vs, selected by name
        show.na = TRUE, 
        show.type = TRUE, 
        show.frq = TRUE, 
        show.prc = TRUE, 
        show.string.values = TRUE, 
        show.id = TRUE)
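
One caveat with grep(): it matches regular-expression patterns rather than exact names, so a short name can hit several columns (for instance, grep("a", colnames(mtcars)) returns more than one position), and the range is then no longer well defined. A sketch of a stricter variant that uses match() for exact names follows; cols_between is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: positions of all columns from `from` to `to`, inclusive,
# using exact name matching rather than regular expressions.
cols_between <- function(df, from, to) {
  idx <- match(c(from, to), colnames(df))
  if (anyNA(idx)) {
    stop("column not found: ", paste(c(from, to)[is.na(idx)], collapse = ", "))
  }
  seq(idx[1], idx[2])
}

data(mtcars)
cols_between(mtcars, "mpg", "vs")                  # positions to use for column indexing
head(mtcars[, cols_between(mtcars, "mpg", "vs")])  # first rows of the selected columns
```

The returned positions can be passed to freq() or view_df() in the same way as the grep() version.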
jsizzle
0

There is a fre function in the expss package:

library(expss)
data(mtcars)
mtcars = apply_labels(mtcars,
                      mpg = "Miles/(US) gallon",
                      cyl = "Number of cylinders",
                      disp = "Displacement (cu.in.)",
                      hp = "Gross horsepower",
                      drat = "Rear axle ratio",
                      wt = "Weight (lb/1000)",
                      qsec = "1/4 mile time",
                      vs = "Engine",
                      vs = c("V-engine" = 0,
                             "Straight engine" = 1),
                      am = "Transmission",
                      am = c("Automatic" = 0,
                             "Manual"=1),
                      gear = "Number of forward gears",
                      carb = "Number of carburetors"
)

# as.list is needed to process data.frame as several variables rather than multiple response
calculate(mtcars, fre(as.list(vs %to% carb)))

Generally speaking, you can use %to% inside calculate with any other function from any package. %to% simply returns a data.frame, e.g. vs %to% carb is identical to mtcars[, c("vs", "am", "gear", "carb")].

Example with sjPlot:

library(sjPlot)
calc(mtcars, view_df(vs %to% carb))
Gregory Demin
  • IT'S THE AS.LIST!!! I was somehow missing the as.list and this is what I wanted! Thank you! – Carley May 09 '19 at 19:34
0

SPSS-style frequency tables for a range of variables, from A to B, are easy to produce with the sjmisc package:

library(sjmisc)
frq(mtcars, mpg:vs)
# output in browser, to copy/paste to Word
frq(mtcars, mpg:vs, out = "b")

See ?frq for examples and different options for selecting variables, computing frequencies on grouped data frames, grouping variables with many unique values etc. And frq() also works with labelled data (see some examples in this vignette).

sjPlot::view_df() creates a code-plan and is a bit overloaded for simple frequency tables, although it can show frequencies as well. There's a recent blog-post showing some examples.

Daniel
  • Thanks everyone! Super helpful! I haven't been able to get back to this yet, but I've got a lot of things to try! I knew I was making this too complicated. – Carley May 07 '19 at 15:30
0

There are already very good solutions posted, but here's one combining summarytools::freq() and dplyr::select() that hasn't been mentioned:

library(summarytools)
library(dplyr)
data("mtcars")
st_options(freq.ignore.threshold = nrow(mtcars))
mtcars %>% select(mpg:vs) %>% freq()

Note that we changed the summarytools option freq.ignore.threshold, which is used to decide which variables to ignore when a whole data frame is passed to freq(). Numerical variables with more distinct values than this threshold (25 by default) are ignored. By setting it to the number of rows of mtcars, we make sure all variables are included.
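
To see which columns the default threshold would affect, one can count the distinct values per column in base R. This is a small sketch only: the cutoff of 25 is taken from the default mentioned above, and n_distinct_vals is just an illustrative name.

```r
data(mtcars)
selected <- mtcars[, c("mpg", "cyl", "disp", "hp", "drat", "wt", "qsec", "vs")]

# Number of distinct values in each selected column
n_distinct_vals <- sapply(selected, function(x) length(unique(x)))
n_distinct_vals

# Columns with more distinct values than the default threshold of 25
names(n_distinct_vals)[n_distinct_vals > 25]
```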

Dominic Comtois