I am working with the R programming language. In a previous post (R: converting tidyverse to dplyr/reshape2 for plots), I learned how to automatically draw a bar chart for every categorical variable in my dataset:

#create data

var_1 <- rnorm(1000,10,10)
var_2 <- rnorm(1000, 5, 5)
var_3 <- rnorm(1000, 6,18)

favorite_food <- c("pizza","ice cream", "sushi", "carrots", "onions", "broccoli", "spinach", "artichoke", "lima beans", "asparagus", "eggplant", "lettuce", "cucumbers")
favorite_food <-  sample(favorite_food, 1000, replace=TRUE, prob=c(0.5, 0.45, 0.04, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001))


response <- c("a","b")
response <- sample(response, 1000, replace=TRUE, prob=c(0.3, 0.7))


data = data.frame( var_1, var_2, var_3, favorite_food, response)

data$favorite_food = as.factor(data$favorite_food)
data$response = as.factor(data$response)

#specify categorical variables
factor_vars <- sapply(data, is.factor)

varnames <- names(data)

deselect_not_factors <- varnames[!factor_vars]

#load libraries 
library(tidyr)
library(ggplot2)

#format data
data_long <- data %>%
  pivot_longer(
    cols = -all_of(deselect_not_factors),  # all_of() avoids the tidyselect warning about external character vectors
    names_to = "category",
    values_to = "value"
  )

#create plots
ggplot(data_long) +
  geom_bar(
    aes(x = value)
  ) +
  facet_wrap(~category, scales = "free")


My question: Is it possible to replace the "pivot_longer" call with functions from the "dplyr" and "reshape2" packages (e.g. "melt()")?

Thanks

stats_noob

1 Answer

With reshape2::melt, pass the columns to keep as identifiers (here deselect_not_factors, the non-factor columns) to id.vars. The arguments that correspond to names_to and values_to in pivot_longer are variable.name and value.name:

library(dplyr)
library(ggplot2)

data %>%
  reshape2::melt(
    id.vars = deselect_not_factors,
    variable.name = "category",
    value.name = "value"
  ) %>%
  ggplot() +
  geom_bar(aes(x = value)) +
  facet_wrap(~category, scales = "free")
akrun
  • Thank you! So now the "data_long" object is no longer needed, and ggplot2 works from "data" directly? – stats_noob Apr 29 '21 at 18:30
  • @stats555 you can either assign the melted result to `data_long` or chain it directly into ggplot, as in the update – akrun Apr 29 '21 at 18:33
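
The two options mentioned in the last comment can be sketched as follows. This is only an illustration, not part of the original answer: the small data frame below is a hypothetical stand-in for the one in the question, and `p` is just a name chosen here for the plot object. Note that melt() is expected to warn that attributes differ across the measure columns, because the two factors have different levels; the values are then stored as character, which geom_bar() handles without trouble.

```r
library(ggplot2)

# Hypothetical stand-in data (smaller than the data set in the question)
set.seed(1)
data <- data.frame(
  var_1 = rnorm(100, 10, 10),
  favorite_food = factor(sample(c("pizza", "ice cream", "sushi"), 100, replace = TRUE)),
  response = factor(sample(c("a", "b"), 100, replace = TRUE))
)

# Names of the non-factor columns; these are kept as id columns
deselect_not_factors <- names(data)[!sapply(data, is.factor)]

# Option 1: store the melted data in an intermediate object first
data_long <- reshape2::melt(
  data,
  id.vars = deselect_not_factors,
  variable.name = "category",
  value.name = "value"
)

# Then plot from the stored object, exactly as with the pivot_longer() result
p <- ggplot(data_long) +
  geom_bar(aes(x = value)) +
  facet_wrap(~category, scales = "free")

# Option 2 is the chained form in the answer: pipe the melt() result
# straight into ggplot() without creating data_long.
```

Printing `p` (or typing it at the console) draws the faceted bar charts. Keeping `data_long` around is mainly useful when the long-format data is needed again later, for example to inspect it with head(data_long).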