I'm struggling to produce what feels like a very basic geom_bar plot. I would like the sum of y in each bin of width 10 along x to be drawn as one solid bar with a black outline. I know that stat = "identity" is what creates the individual blocks within each bar, but I can't find an alternative that gets me to my end goal. I cheated and made the desired plot below in Illustrator.

Plot that I want

I don't really want to code x as a factor for the bins, because I want to keep the numeric axis ticks and labels rather than text such as "0-10", "10-20", etc. Is there a way to do this in ggplot without using summarise or cut on the raw data? I am also aware of the geom_col and stat_count options but, again, can't achieve my desired outcome.

The data frame is below, where y holds counts at various values of a continuous variable x, and type is a factor variable.

y = c(1, 1, 3, 2, 1, 1, 2, 1, 1, 1, 1, 1, 4, 1, 1, 1, 2, 1, 2, 3, 2, 2, 1)
x = c(26.7, 28.5, 30.0, 34.8, 35.0, 36.4, 38.6, 40.0, 42.1, 43.7, 44.1, 45.0, 45.5, 47.4, 48.0, 57.2, 57.8, 64.2, 65.0, 66.7, 68.0, 74.4, 94.1)
type = c(rep("Type 1", 20), "Type 2", rep("Type 1", 2))
df <- data.frame(x, y, type)

A bar plot of the total y count for each bin of x. I'm trying to fill by the total for each type, but I get an individual block for every row, as shown by the black outlines. I would like a single block per type in each bar.

library(ggplot2)

ggplot(df, aes(y = y, x = x)) +
  geom_bar(stat = "identity",color = "black", aes(fill = type))+ 
  scale_x_binned(limits = c(20,100))+
  scale_y_continuous(expand = c(0, 0), breaks = seq(0,10,2)) +
  xlab("")+
  ylab("Total Count")

Incorrect 1

Alternatively, I tried to show just the total count within each bin, but I don't want the internal lines in the bars, only the black outline around each bar.

ggplot(df,aes(y=y, x=x))+ 
  geom_col(fill =  "#00C3C6", color = "black")+ 
  scale_x_binned(limits = c(20,100))+
  scale_y_continuous(expand = c(0, 0), breaks = seq(0,10,2)) +
  xlab("")+
  ylab("Total Count")

Incorrect 2

1 Answer

Here is one way to do it, by transforming the data first and then using geom_col:

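library(dplyr)
library(ggplot2)

# Sum y within each 10-wide bin of x (bins = lower edge of the bin), per type
df <- df |>
  mutate(bins = floor(x/10) * 10) |>
  group_by(bins, type) |>
  summarise(y = sum(y), .groups = "drop")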
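
As a quick sanity check of the binning step, here is a small base R sketch that reproduces the per-bin totals (ignoring type) from the vectors in the question. The names bin_lower and totals are illustrative and not part of the original code.

```r
# x and y copied from the question
x <- c(26.7, 28.5, 30.0, 34.8, 35.0, 36.4, 38.6, 40.0, 42.1, 43.7, 44.1, 45.0,
       45.5, 47.4, 48.0, 57.2, 57.8, 64.2, 65.0, 66.7, 68.0, 74.4, 94.1)
y <- c(1, 1, 3, 2, 1, 1, 2, 1, 1, 1, 1, 1, 4, 1, 1, 1, 2, 1, 2, 3, 2, 2, 1)

# floor(x / 10) * 10 maps each x to the lower edge of its 10-wide bin,
# e.g. 26.7 -> 20 and 30.0 -> 30
bin_lower <- floor(x / 10) * 10

# Total y per bin, named by the bin's lower edge
totals <- tapply(y, bin_lower, sum)
print(totals)
```

Because floor is used, a value sitting exactly on a boundary (such as 30.0 or 40.0) is counted in the bin that starts at that value.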

ggplot(data = df,
       aes(y = y,
           x = bins))+ 
  geom_col(aes(fill = type),
           color = "black")+ 
  scale_x_continuous(breaks = seq(0,100,10)) +  
  scale_y_continuous(expand = c(0, 0), 
                     breaks = seq(0,10,2)) +
  xlab("")+
  ylab("Total Count")
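
If pre-summarising the data is undesirable, one possible alternative is to let ggplot2 do the binning with geom_histogram and pass y through the weight aesthetic, so that each bar's height is the sum of y in that bin rather than the number of rows. This is a sketch under a few assumptions, not a definitive solution: the names raw and p are illustrative, closed = "left" is chosen so the bins match floor(x / 10) * 10, and the y-axis breaks are extended to 12 because the largest bin total in this data is 11.

```r
library(ggplot2)

# Raw (unsummarised) data copied from the question
raw <- data.frame(
  x = c(26.7, 28.5, 30.0, 34.8, 35.0, 36.4, 38.6, 40.0, 42.1, 43.7, 44.1, 45.0,
        45.5, 47.4, 48.0, 57.2, 57.8, 64.2, 65.0, 66.7, 68.0, 74.4, 94.1),
  y = c(1, 1, 3, 2, 1, 1, 2, 1, 1, 1, 1, 1, 4, 1, 1, 1, 2, 1, 2, 3, 2, 2, 1),
  type = c(rep("Type 1", 20), "Type 2", rep("Type 1", 2))
)

# weight = y makes each bin's height the sum of y instead of the row count.
# binwidth = 10 with boundary = 0 puts bin edges at 0, 10, 20, ...
# closed = "left" gives bins [20, 30), [30, 40), ... (assumption, see above)
p <- ggplot(raw, aes(x = x, weight = y, fill = type)) +
  geom_histogram(binwidth = 10, boundary = 0, closed = "left",
                 color = "black") +
  scale_x_continuous(breaks = seq(0, 100, 10)) +
  scale_y_continuous(expand = c(0, 0), breaks = seq(0, 12, 2)) +
  xlab("") +
  ylab("Total Count")

p
```

One design difference worth noting: geom_col centres each bar on its x value, so in the geom_col version the bars are centred on the lower edge of each bin, whereas the histogram draws each bar across the full width of its bin, keeping a continuous x axis with ordinary numeric ticks.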


Arthur Welle