
I'm trying to write code that makes a new graph for each column of my dataset, rather than having to write out a new value for y each time.

I have a dataset where each row is a person and each column is a measurement in the blood (e.g., insulin, glucose, etc.). I have a few extra columns with descriptive categories that I'm using for my groups (e.g., lean, obese). I'd like to make a graph for each of those measurement columns (i.e., one graph for insulin, another for glucose, etc.). I have 90 different variables to cycle through.

I've figured out how to make a boxplot for each of these, but can't figure out how to have the code "loop" so that I don't have to rewrite the code for each variable.

Using the mtcars dataset as an example, I have it making a graph where the y is disp, and then another graph where y = hp, and then y = drat.

library(ggplot2)
data("mtcars")

#boxplot with individual points - first y variable
ggplot(data = mtcars, aes(x = cyl, y = disp)) +
  geom_boxplot()+
  geom_point()

#boxplot with individual points - 2nd y variable
ggplot(data = mtcars, aes(x = cyl, y = hp)) +
  geom_boxplot()+
  geom_point()

#boxplot with individual points - 3rd y variable
ggplot(data = mtcars, aes(x = cyl, y = drat)) +
  geom_boxplot()+
  geom_point()

How do I set this up so my code will automatically cycle through all of the variables in the dataset (I have 90 of them)?

Erin Giles
  • What are you going to do with the 90 plots in the end? Asking because you might want to take a look at `facet_wrap` or (even better) `ggforce::facet_wrap_paginate` after you've reshaped your data from wide to long. – markus Jun 11 '20 at 18:46
  • Ok - will take a look. At this point I just want to do a quick visualization of the data. Happy to try either of those if it'll do the same thing, but put them in a nicer format. Can you help direct me on how to use that function? – Erin Giles Jun 11 '20 at 19:18

2 Answers


Here's a basic solution, where you would populate vector_of_yvals with your 90 variables to loop through:

library(tidyverse)

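plot_func <- function(yval){
  # yval is a column name given as a string; .data[[yval]] looks that column up.
  # A bare aes(y = yval) would plot the literal string instead of the data.
  # group = cyl gives one box per cyl value, since cyl is numeric.
  p <- ggplot(data = mtcars, aes(x = cyl, y = .data[[yval]], group = cyl)) +
    geom_boxplot()+
    geom_point()
  p
}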


vector_of_yvals <- c("disp", "hp", "drat")

list_of_plots <- map(vector_of_yvals, plot_func)

You can populate vector_of_yvals with all of the variables in your dataframe by doing:

vector_of_yvals <- colnames(mtcars)

This will give you a vector:

[1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

If you don't want to include cyl in your vector, you can filter it out like so:

vector_of_yvals <- vector_of_yvals %>% .[. != "cyl"]
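With 90 measurement columns plus a few descriptive ones, it may be easier to pick the y variables by type than by name. Here is a minimal sketch of that idea, assuming cyl stands in for the grouping column and every other numeric column is a measurement; `group_col` and `plot_one` are names made up for this illustration:

```r
library(ggplot2)

# Assumption: "cyl" plays the role of the grouping column (e.g. lean/obese).
group_col <- "cyl"

# Keep every numeric column except the grouping column.
is_num <- vapply(mtcars, is.numeric, logical(1))
vector_of_yvals <- setdiff(names(mtcars)[is_num], group_col)

# One boxplot per column name; .data[[...]] looks a column up by its name.
plot_one <- function(yval) {
  ggplot(mtcars, aes(x = factor(.data[[group_col]]), y = .data[[yval]])) +
    geom_boxplot() +
    geom_point() +
    labs(x = group_col, y = yval)
}

# Name the list so a plot can be pulled out by variable, e.g. list_of_plots$hp.
list_of_plots <- lapply(vector_of_yvals, plot_one)
names(list_of_plots) <- vector_of_yvals
```

Typing `list_of_plots$hp` at the console then draws the plot for hp; descriptive text columns are skipped automatically because they are not numeric.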
Matt
  • Do I have to type out each of those y values, or is there a way to have this done without typing each one? ie vector_of_yvals <- c("disp":"drat") – Erin Giles Jun 11 '20 at 18:44
  • You can use this: `vector_of_yvals <- names(mtcars)[3:5]` – Ahorn Jun 11 '20 at 18:47

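Here is a slightly different version that uses a for loop and !!sym() to turn the variable's text string into a column reference:

library(ggplot2)
library(rlang)
variables <- c("disp", "hp", "drat")

for (var in variables) {
  # print(var)
  # sym() converts the string to a symbol; !! unquotes it inside aes()
  p <- ggplot(data = mtcars, aes(x = cyl, y = !!sym(var), group = cyl)) +
    geom_boxplot() +
    geom_point()
  # inside a loop, a plot must be printed explicitly to be drawn
  print(p)
}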
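For a quick look through many variables, the same loop can write every plot to one multi-page PDF. This is only a sketch: the output path `out_file` is a hypothetical name chosen for illustration, and `.data[[var]]` is used here as an alternative to `!!sym(var)`.

```r
library(ggplot2)

variables <- c("disp", "hp", "drat")

# Hypothetical output location; change it to wherever the file should go.
out_file <- file.path(tempdir(), "boxplots.pdf")

# Open a PDF device; each print() call below starts a new page.
pdf(out_file, width = 6, height = 4)
for (var in variables) {
  p <- ggplot(mtcars, aes(x = factor(cyl), y = .data[[var]])) +
    geom_boxplot() +
    geom_point() +
    ggtitle(var)
  print(p)
}
# Close the device so the file is written to disk.
dev.off()
```

Each variable ends up on its own page, titled with the column name, which makes it easy to flip through 90 plots.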
Dave2e
  • This is working great! I'm now trying to use the suggested ggforce::facet_wrap_paginate so my graphs are not one per page. I think the code needs to be added before the print command, but I don't know what the first argument, "facets", should be: `facet_wrap_paginate(facets, nrow = 9, ncol = 10, scales = "free", shrink = TRUE, labeller = "label_value", as.table = TRUE, switch = NULL, drop = TRUE, dir = "h", strip.position = "top", page = 1)` – Erin Giles Jun 11 '20 at 20:57
  • To use `ggforce::facet_wrap_paginate` you will have to reshape your data from the wide form to a long format. `tidyr::pivot_longer()` is the function to perform that transformation. – Dave2e Jun 11 '20 at 21:20
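
The reshape-then-facet approach from the comments could look roughly like the sketch below. It assumes the ggforce and tidyr packages are installed, that cyl stands in for the grouping column, and that disp, hp and drat stand in for the measurements. The column names `measurement` and `value` are made up for this illustration, and the "facets" argument is simply a formula naming the column that holds the variable names.

```r
library(ggplot2)
library(tidyr)

# Wide to long: one row per (car, measurement) pair.
long <- pivot_longer(
  mtcars[, c("cyl", "disp", "hp", "drat")],
  cols = -cyl,
  names_to = "measurement",
  values_to = "value"
)

# One panel per measurement, two panels per page; page = 1 draws the first page.
p <- ggplot(long, aes(x = factor(cyl), y = value)) +
  geom_boxplot() +
  geom_point() +
  ggforce::facet_wrap_paginate(~ measurement, nrow = 1, ncol = 2,
                               scales = "free_y", page = 1)
print(p)
```

Changing `page` to 2 draws the remaining panel; with 90 variables, `nrow` and `ncol` control how many panels share a page.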