I'm new to R and I'm doing the R course from DataQuest. I have a csv of forest fires. The file can be downloaded here:

https://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/

I want to create a function that groups the data by a column "x" (e.g. month or day) and returns a bar chart of the counts.

library(readr)
library(dplyr)
library(ggplot2)

forestFires <- read_csv("forestfires.csv")

forestFiresCountPlot <- function(x) {
  forestFiresGroup <- forestFires %>%
  group_by(x) %>% 
  summarise(n(x)) %>%
  ggplot(data = forestFiresGroup) + 
    aes(x = x, y = n(x)) +
    geom_bar()
}

forestFiresMonth <- forestFiresCountPlot(month)
forestFiresDay <- forestFiresCountPlot(day)

# Output - Error: Column `x` is unknown

When I call the function, how do I state that month and day are columns?

Julian

2 Answers

Welcome to the world of programming with dplyr/ggplot2/tidyverse. You'll want to read more about the details here, but the following will get you going:

library(tidyverse)

df <- read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/forest-fires/forestfires.csv")

plot_group <- function(df, grp) {
  grp_var <- enquo(grp)
  df %>%
    count(!! grp_var) %>%
    ggplot(aes(x = !!grp_var, y = n)) +
    geom_col()
}

plot_group(df, month)
plot_group(df, day)
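
With rlang 0.4.0 or later, the enquo() / !! pair can usually be collapsed into the embrace operator {{ }}. The following is only a minimal sketch: the `fires` data frame is a made-up stand-in for the downloaded CSV, and `plot_group_embrace` is a hypothetical name, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# Made-up rows standing in for the forest fires data (illustration only)
fires <- data.frame(
  month = c("aug", "aug", "sep", "mar", "sep", "sep"),
  day   = c("fri", "sat", "sat", "sun", "fri", "mon"),
  stringsAsFactors = FALSE
)

# Hypothetical variant of plot_group(): {{ grp }} captures the bare column
# name passed by the caller and injects it into count() and aes()
plot_group_embrace <- function(df, grp) {
  df %>%
    count({{ grp }}) %>%
    ggplot(aes(x = {{ grp }}, y = n)) +
    geom_col()
}

p_month <- plot_group_embrace(fires, month)
```

Calling plot_group_embrace(fires, day) should work the same way, since the column is still passed unquoted.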

Note: You may want to relevel the month and day variables first so they plot in a more expected order:

df <- df %>%
  mutate(
    month = fct_relevel(month, str_to_lower(month.abb)),
    day = fct_relevel(day, c("sun", "mon", "tue", "wed", "thu", "fri", "sat"))
  )
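
To see what the releveling changes, here is a small sketch on a made-up vector (not the real data). A character column is plotted in alphabetical order by default, whereas fct_relevel() moves the named levels to the front in the order given.

```r
library(forcats)

# Made-up values, for illustration only
days <- c("wed", "mon", "sun", "mon")

# Default factor levels are sorted alphabetically
alphabetical <- levels(factor(days))

# fct_relevel() puts the listed levels first, in the order supplied
calendar <- levels(fct_relevel(days, "sun", "mon", "wed"))
```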
JasonAizkalns

You can try something like this:

forestFiresCountPlot <- function(x) {

  forestFires %>%  
    group_by_at(x) %>% 
    summarize(n = n()) %>%
    ggplot() + 
      aes_string(x = x, y = "n") +
      geom_bar(stat = "identity")
}

forestFiresCountPlot("month")
forestFiresCountPlot("day")
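
Note that aes_string() has been soft-deprecated since ggplot2 3.0.0 and group_by_at() is superseded as of dplyr 1.0.0. On current versions the same string-based interface could probably be written with across(all_of()) and the .data pronoun instead. This is a sketch under assumptions: `fires` is a made-up stand-in for the forest fires data, and `count_plot_chr` is a hypothetical name.

```r
library(dplyr)
library(ggplot2)

# Made-up rows standing in for the forest fires data (illustration only)
fires <- data.frame(
  month = c("aug", "aug", "sep", "mar", "sep", "sep"),
  day   = c("fri", "sat", "sat", "sun", "fri", "mon"),
  stringsAsFactors = FALSE
)

# Hypothetical string-based version: x is a column name given as a string
count_plot_chr <- function(df, x) {
  df %>%
    group_by(across(all_of(x))) %>%          # replaces group_by_at(x)
    summarise(n = n(), .groups = "drop") %>%
    ggplot(aes(x = .data[[x]], y = n)) +     # replaces aes_string()
    geom_col()                               # same as geom_bar(stat = "identity")
}

p_day <- count_plot_chr(fires, "day")
```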
Wil
Sonny