I'd like to create a map that shows the value of a variable for each state. The dataset is at the state level, contains around a thousand variables, and covers about 100 years.

The code I have, which works, is:

    plot_usmap(data = database, values = "var1") +
      scale_fill_continuous(
        low = "white", high = "blue", na.value = "light gray",
        name = "Title of graph", label = scales::comma
      ) +
      theme(legend.position = "right")

Now what I'd like to do is create this map for a list of about 15 variables and 10 years. I'm usually a Stata user, and there I could define a variable list and then loop through it. Following page 7 of the document "A Quick Introduction to R (for Stata Users)", I tried applying the following solution:

vars <- c("database$var1", "database$var2", "database$var3","database$var4", "database$var5", "database$var6", "database$var7", "database$var8", "database$var9", "database$var10", "database$var11", "database$var12")
for(var in vars) {
  v <- get(var)
  plot_usmap(data = database, values = "v") + 
    scale_fill_continuous(low = "white", high = "blue", na.value="light gray", name = "v", label = scales::comma) + theme(legend.position = "right")
}

With this code, I get the error "Error in get(var) : object 'database$var1' not found". Yet when I try view(database$var1), the column appears. The next problem is that I'd like the title of the graph to be the label of the variable rather than the variable name. In the example above, I'd restricted the whole data to include only 1 year, so if there's a way to set the code up so that I could use the whole database but map only selected years, that would be great.

Any insights would be appreciated! I read that in R, "for" isn't used as much, so if there is a better way to do it, please let me know.

PierreRoubaix

1 Answer

Basically, it's not that different in R. First, there is no need to use `get`, and in general it should be avoided. Second, while `for` loops are fine, the more R-ish way is to use `lapply`. This is especially recommended when making plots via ggplot2: inside a `for` loop a plot is not displayed unless you call `print()` on it, whereas `lapply` returns the plots as a list.
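To illustrate why the `get` call in the question fails, here is a minimal base-R sketch. It assumes a toy two-column data frame in place of the real `database`; `col_means` is only an illustrative name. The idea is to store bare column names and index the data frame with `[[ ]]`.

```r
# Toy stand-in for the real data (assumption: the real database has these columns)
database <- data.frame(var1 = c(1, 2), var2 = c(3, 4))

# get() looks up an object by its exact name, so this fails:
# there is no object literally called "database$var1"
res <- try(get("database$var1"), silent = TRUE)
inherits(res, "try-error")

# Store bare column names and index the data frame with [[ ]] instead
vars <- c("var1", "var2")
col_means <- sapply(vars, function(v) mean(database[[v]]))
col_means
```

The same pattern applies to plotting: pass the column name as a string and let the plotting function look it up in the data.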

Making use of some fake example data to mimic your database:

library(usmap)
library(ggplot2)

# Example data
database <- statepop
names(database) <- c("fips", "abbr", "full", "var1")
database$var2 <- database$var1

vars <- c("var1", "var2")

lapply(vars, function(x) {
  plot_usmap(data = database, values = x) + 
    scale_fill_continuous(
      low = "white", high = "blue", na.value = "light gray", name = "Title of graph", label = scales::comma
    ) + 
    theme(legend.position = "right") +
    labs(title = x)
})
#> [[1]]

#> 
#> [[2]]

EDIT: Assuming that your data contains a column with years, I would suggest wrapping the plotting code inside a function which takes your database, a vector of vars and the desired year as arguments. There are other approaches, though, and which works best depends on your desired result.

library(usmap)
library(ggplot2)
library(labelled)

# Example data
database <- statepop
names(database) <- c("fips", "abbr", "full", "var1")
database$year <- 2015
database <- rbind(database, transform(database, year = 2020))
var_label(database$var1) <- "Population"

vars <- c("var1")
names(vars) <- vars

map_vars <- function(.data, vars, year) {
  # Keep only the rows for the requested year
  data_year <- .data[.data$year == year, ]
  lapply(vars, function(x) {
    plot_usmap(data = data_year, values = x) +
      scale_fill_continuous(
        low = "white", high = "blue", na.value = "light gray", name = "Title of graph", label = scales::comma
      ) +
      theme(legend.position = "right") +
      # Read the label from the unfiltered data, since row subsetting can drop attributes
      labs(title = paste(var_label(.data[[x]]), "in", year))
  })
}

map_vars(database, vars, 2015)
#> $var1

map_vars(database, vars, 2020)
#> $var1
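
One possible way to cover several variables and several years in a single call is to build every variable/year combination with `expand.grid` and loop over the pairs with `Map`. The sketch below is base R only. `map_grid`, `draw` and `describe` are hypothetical names, the data frame is a toy stand-in, and the plotting call is replaced by a function that just returns a string so the structure is easy to check.

```r
# Hypothetical helper: apply `draw` to every variable/year combination
map_grid <- function(.data, vars, years, draw) {
  # One row per (variable, year) pair; keep strings as strings
  combos <- expand.grid(var = vars, year = years, stringsAsFactors = FALSE)
  out <- Map(function(v, y) {
    data_year <- .data[.data$year == y, , drop = FALSE]
    draw(data_year, v, paste(v, "in", y))
  }, combos$var, combos$year)
  # Name each result after its combination, e.g. "var1_2015"
  names(out) <- paste(combos$var, combos$year, sep = "_")
  out
}

# Toy data standing in for the real state-level database
database <- data.frame(
  fips = rep(c("01", "02"), times = 2),
  year = rep(c(2015, 2020), each = 2),
  var1 = c(1, 2, 3, 4),
  var2 = c(10, 20, 30, 40)
)

# Stand-in for the plotting call: report the title and the row count
describe <- function(d, v, title) paste0(title, " (", nrow(d), " rows)")

res <- map_grid(database, c("var1", "var2"), c(2015, 2020), describe)
names(res)
```

In practice `draw` would wrap the plotting code from above, for example `function(d, v, title) plot_usmap(data = d, values = v) + labs(title = title)`, so that `res` becomes a named list of maps.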

stefan
  • Thank you! That worked. Do you know how I'd change the title of the map to correspond to the variable label rather than the variable name? For example, after `library(labelled)` and `var_label(statepop$pop_2015) <- "Population in 2015"`. Furthermore, if we had multiple years of data, say 2000-2020, how would we choose which years to include, and incorporate the year in the title of the graph? For example, I could imagine defining a list of variable labels and a list of years, but it seems like lapply doesn't have the capability for nesting. – PierreRoubaix Nov 22 '21 at 13:35
  • Hi Pierre. The first question is easy: if your data is already labelled, you can add `+ labs(title = labelled::var_label(database[[x]]))` to use the variable label as the title. Besides lapply there is also mapply, which allows you to loop over multiple lists. Finally, I just made an edit to show one approach for when your data contains multiple years and how to add the year to the title. – stefan Nov 22 '21 at 20:04