I have a dataset that I would like to use to create several graphs with ggplot. In particular, I'm interested in graphs that display the proportions of one variable at a time, broken down by groups of interest.

For example, if I had the following data (taken from https://r-graph-gallery.com/48-grouped-barplot-with-ggplot2):

> specie <- c(rep("sorgho" , 6) , rep("poacee" , 6) , rep("banana" , 6) , rep("triticum" , 6) )
> condition <- rep(c("normal" , "stress" , "Nitrogen") , 8)
> year <- sample(2010:2015, 24, T)
> value <- abs(rnorm(24 , 0 , 15))
> data <- data.frame(specie,condition,year,value)
> data
     specie condition year      value
1    sorgho    normal 2011 17.8037935
2    sorgho    stress 2010  8.6299341
3    sorgho  Nitrogen 2012  1.2871026
4    sorgho    normal 2014  9.8643681
5    sorgho    stress 2011  5.1731324
6    sorgho  Nitrogen 2012  9.2649930
7    poacee    normal 2014  6.2520248
8    poacee    stress 2014 13.2160232
9    poacee  Nitrogen 2014 29.9889261
10   poacee    normal 2013 10.7256729
11   poacee    stress 2012 16.4558228
12   poacee  Nitrogen 2010 17.1326142
13   banana    normal 2011  5.6082405
14   banana    stress 2011 29.4525063
15   banana  Nitrogen 2011 18.0773934
16   banana    normal 2014 24.0420069
17   banana    stress 2015  4.9791529
18   banana  Nitrogen 2013 29.3568770
19 triticum    normal 2012  0.5611722
20 triticum    stress 2011  3.3871517
21 triticum  Nitrogen 2014 24.8600963
22 triticum    normal 2011  8.7093930
23 triticum    stress 2011 25.5382877
24 triticum  Nitrogen 2015  9.9251351

I would be interested in displaying the proportion of an individual condition by year. For example, if I wanted to create my graph of interest in Excel to view just the Nitrogen condition, it would look like this:

enter image description here

What would be the smartest way to create said graph via ggplot? My actual dataset has many more variables, so I'm trying to avoid creating summarised datasets with the percentage calculation for each individual variable worked out, if that makes sense.

From there I'd like to create a combo bar graph that also displays the raw numbers, like the following image. If I can work out how to display the proportions, I think I can figure out the rest from other Stack Overflow questions such as "Combining Bar and Line chart (double axis) in ggplot2".

enter image description here

cstrudels

2 Answers

Is this what you are looking for?

enter image description here

Code:

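library(ggplot2)

# Stacked bars of the raw values per year, one panel per condition
ggplot(data, aes(x = year, y = value)) +
  geom_col(position = "stack", fill = "darkblue") +
  # Orange line through the mean value for each year
  stat_summary(fun = mean, geom = "line", colour = "orange", size = 2) +
  facet_wrap(~condition)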
Bushidov

Here is a solution.
Given that there are three groups, first compute proportions by year and condition, creating a summary data set. Then plot it, with a secondary axis scaled to the maximum value.

set.seed(2022)
specie <- c(rep("sorgho" , 6) , rep("poacee" , 6) , rep("banana" , 6) , rep("triticum" , 6) )
condition <- rep(c("normal" , "stress" , "Nitrogen") , 8)
year <- sample(2010:2015, 24, T)
value <- abs(rnorm(24 , 0 , 15))
data <- data.frame(specie,condition,year,value)

suppressPackageStartupMessages({
  library(ggplot2)
  library(dplyr)
})

smrydata <- data %>%
  group_by(year) %>%
  mutate(total = sum(value)) %>%
  group_by(condition, year) %>%
  summarise(value = sum(value),
            prop = value/sum(total), 
            .groups = "drop")

Max <- max(smrydata$value)

ggplot(smrydata, aes(year, value)) +
  geom_col(fill = "steelblue") +
  geom_line(aes(y = prop * Max), colour = "orange", size = 2) +
  scale_y_continuous(sec.axis = sec_axis( ~ ./Max, labels = scales::percent)) +
  facet_wrap(~ condition) +
  theme_bw()

Created on 2022-09-12 by the reprex package (v2.0.1)
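
As a small illustration of the secondary-axis rescaling used above: the line is drawn at `prop * Max` so that it shares the primary axis with the bars, and `sec_axis(~ ./Max)` divides by the same factor to label it as a proportion again. The numbers below are hypothetical and only show that the two steps cancel out.

```r
# Hypothetical numbers, used only to illustrate the rescaling
value <- c(12, 30, 18)        # bar heights on the primary axis
prop  <- c(0.20, 0.50, 0.30)  # proportions to be drawn as a line
Max   <- max(value)           # scaling factor shared by both steps

line_y <- prop * Max          # heights at which geom_line() draws the line
back   <- line_y / Max        # values the secondary axis shows for those heights

all.equal(back, prop)         # the secondary axis recovers the proportions
```

With this choice of factor, a proportion of 1 lines up with the tallest bar.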

Rui Barradas
  • I was hoping to avoid having to manually add a proportions column but I guess it's the easiest thing to do. Although I think you don't want your proportion to be divided by the sum of total, just total itself. But this is suitable for me, thanks – cstrudels Sep 18 '22 at 08:32
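
The comment above suggests dividing by the yearly total itself rather than by its sum within each condition/year group. A minimal sketch of that variant follows, assuming dplyr is available. The `toy` data frame and the name `smry_fixed` are hypothetical and chosen so the proportions are easy to check by hand.

```r
library(dplyr)

# Hypothetical toy data: one year, with "normal" appearing twice
toy <- data.frame(
  condition = c("normal", "normal", "stress", "Nitrogen"),
  year      = c(2010, 2010, 2010, 2010),
  value     = c(1, 1, 2, 4)
)

smry_fixed <- toy %>%
  group_by(year) %>%
  mutate(total = sum(value)) %>%   # yearly total, repeated on every row
  group_by(condition, year) %>%
  summarise(
    # total is constant within a year, so take it once instead of summing it
    prop  = sum(value) / first(total),
    value = sum(value),
    .groups = "drop"
  )

smry_fixed
```

Here the yearly total is 8, so "normal" (1 + 1) comes out at 0.25. Dividing by `sum(total)` would instead use 16 for that group, because `total` is repeated on both "normal" rows, and would give 0.125.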