I'd like to produce a bar chart that highlights different categories, using geom_col with one facet per country.

The issue is that the coloured segment does not align with the 'greyed' segment for the same country; instead it always sits at the bottom of the chart.

Steel Production chart

Produced with this code:


library(tidyverse)

# Steel production data
  steel <- tribble(
    ~country,   ~"2016",    ~"2017",    ~"2018",    ~"2019",
    "China",     828.4,  853.7,  863.9,  988.2, 
    "Japan",     104.9,  104.7,  104.2,  99.6, 
    "India",     95.0,   101.5,  107.8,  111.5, 
    "USA",     80.2,     81.6,   84.2,   88.2, 
    "Other",     564.8,  577.7,  587.8,  549.9 
  )

# Pivot the data and turn country into factors
  steel_long <- tidyr::pivot_longer(steel, -country, names_to = "year", values_to = "production")
  names(steel_long) <- tolower(names(steel_long))

  steel_long$country <- as.factor(steel_long$country)
  steel_long$country <- forcats::fct_relevel(steel_long$country, "Other", after = Inf) # Always put RotW last

  steel_long$country2 <- steel_long$country # Add second country to add the grey lines on the mini charts

  steel_long$year <- lubridate::make_date(year = steel_long$year, 12, 31)

# Graph - Column
  ggplot() +
    geom_col(data = steel_long[, 2:4], 
             mapping = aes(x = year, y = production, group = country2), colour = "white", fill = "grey", lwd = 1) +
    geom_col(data = steel_long, mapping = aes(x = year, y = production, fill = country), lwd = 1.1) +
    facet_wrap(~country) +
    labs(title = "Global steel production (Source: World Steel Association)", x = "", y = "Million metric tons") +
    guides(fill = "none") +
    theme_minimal()

Is it possible to colour the specific area of the column that is related to the country?

Thank you

Rui Barradas
Tech Commodities

1 Answer

Try this. The basic idea is to replicate the dataset once per country. When facetting, the data is split by the facet variable, so each facet would otherwise receive only its own country's rows. Replicating the dataset ensures that the stacked columns in every facet are built from the whole dataset. The copies differ only in the country_fill column, which holds a colour for the country to highlight and grey for all others. To use these colour values directly as fills in the plot, I use scale_fill_identity.
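The replication step can be seen in isolation on a toy example. The sketch below is a minimal base-R illustration, not the code from this answer: the data frame `toy`, its values, the `panel` column and the two colours in `palette` are all made up for the example.

```r
# Toy data (made-up values): two countries, two years
toy <- data.frame(
  country    = c("A", "A", "B", "B"),
  year       = c(2018, 2019, 2018, 2019),
  production = c(10, 12, 5, 6),
  stringsAsFactors = FALSE
)

# One full copy of the data per facet panel: merge() with by = NULL
# returns the Cartesian product of the two data frames
panels  <- data.frame(panel = unique(toy$country), stringsAsFactors = FALSE)
toy_rep <- merge(panels, toy, by = NULL)

# Hypothetical colours, one per country
palette <- c(A = "#F8766D", B = "#00BFC4")

# Keep the colour only where the row's country matches its panel
toy_rep$country_fill <- ifelse(
  toy_rep$country == toy_rep$panel,
  unname(palette[toy_rep$country]),
  "grey"
)

print(toy_rep)
```

Each panel now holds all four rows, so a stacked column in facet A still contains B's values (in grey), and A's coloured segment sits at the same height as it would in the full chart.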

library(tidyverse)

# Steel production data
steel <- tribble(
  ~country,   ~"2016",    ~"2017",    ~"2018",    ~"2019",
  "China",     828.4,  853.7,  863.9,  988.2, 
  "Japan",     104.9,  104.7,  104.2,  99.6, 
  "India",     95.0,   101.5,  107.8,  111.5, 
  "USA",     80.2,     81.6,   84.2,   88.2, 
  "Other",     564.8,  577.7,  587.8,  549.9 
)

# Pivot the data and turn country into factors
steel_long <- tidyr::pivot_longer(steel, -country, names_to = "year", values_to = "production")
names(steel_long) <- tolower(names(steel_long))

steel_long$country <- as.factor(steel_long$country)
steel_long$country <- forcats::fct_relevel(steel_long$country, "Other", after = Inf) # Always put RotW last

steel_long$country2 <- steel_long$country # Add second country to add the grey lines on the mini charts

steel_long$year <- lubridate::make_date(year = steel_long$year, 12, 31)

# Colors
colors <- scales::hue_pal()(5) %>% 
  setNames(unique(steel_long$country)) %>% 
  tibble::enframe(name = "country3", value = "country_fill")

# Replicate dataframe
steel_long_rep <- purrr::map(unique(steel_long$country), ~ steel_long) %>% 
  setNames(unique(steel_long$country)) %>% 
  bind_rows(.id = "country3") %>%
  # Join colors
  left_join(colors, by = "country3") %>% 
  # Set fill for non-facet countries to grey
  mutate(country_fill = ifelse(country != country3, "grey", country_fill))

steel_long_rep$country3 <- forcats::fct_relevel(steel_long_rep$country3, "Other", after = Inf) 

# Graph - Column
ggplot() +
  geom_col(data = steel_long_rep, mapping = aes(x = year, y = production, group = country, fill = country_fill), colour = "white", lwd = 1) +
  scale_fill_identity() +
  facet_wrap(~country3) +
  labs(title = "Global steel production (Source: World Steel Association)", x = "", y = "Million metric tons") +
  guides(fill = "none") +
  theme_minimal()

Created on 2020-04-12 by the reprex package (v0.3.0)
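A possible variation, sketched here under assumptions rather than taken from the answer: instead of storing colour strings, set the fill variable to NA for every country that does not match the panel, and let the scale's `na.value` paint those segments grey. The names `rep_df`, `panel` and `highlight` are hypothetical, years are kept as plain integers for brevity, and the colours come from ggplot2's default hue scale, so they may differ from the ones above.

```r
library(ggplot2)

# Long-format data from the question (years as plain integers here)
lv <- c("China", "India", "Japan", "USA", "Other")   # "Other" last
steel_long <- data.frame(
  country    = rep(c("China", "Japan", "India", "USA", "Other"), each = 4),
  year       = rep(2016:2019, times = 5),
  production = c(828.4, 853.7, 863.9, 988.2,
                 104.9, 104.7, 104.2,  99.6,
                  95.0, 101.5, 107.8, 111.5,
                  80.2,  81.6,  84.2,  88.2,
                 564.8, 577.7, 587.8, 549.9)
)
steel_long$country <- factor(steel_long$country, levels = lv)

# One full copy of the data per panel (Cartesian product via by = NULL)
rep_df <- merge(data.frame(panel = factor(lv, levels = lv)), steel_long, by = NULL)

# NA everywhere except for the panel's own country
same <- as.character(rep_df$country) == as.character(rep_df$panel)
rep_df$highlight <- factor(ifelse(same, as.character(rep_df$country), NA), levels = lv)

# group = country keeps the stacking order identical in every panel;
# NA fills are drawn with na.value
p <- ggplot(rep_df, aes(x = year, y = production, group = country, fill = highlight)) +
  geom_col(colour = "white") +
  scale_fill_hue(na.value = "grey") +
  facet_wrap(~panel) +
  labs(title = "Global steel production (Source: World Steel Association)",
       x = "", y = "Million metric tons") +
  guides(fill = "none") +
  theme_minimal()
p
```

The trade-off is that the colours are chosen by the scale rather than stored in the data, which is shorter but gives less direct control than `scale_fill_identity()`.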

stefan
  • Thank you. That answers my question exactly. I see that it needs to reproduce the data for each facet, and you achieve it so efficiently. – Tech Commodities Apr 12 '20 at 20:00