Creating Stacked Bar Chart With one Variable for each Bar, using melt, and ggplot

Question

This question raises different points from the one I posted yesterday and has a better description, so I hope for your understanding. I have the following data:

Data <- data.frame(LMX = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64), Thriving = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55), Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64))
rownames(Data) <- 1:8

My aim is to generate a flipped (horizontal) bar chart with one bar for each variable, where every bar sums to 100% and is divided according to the values: yellow for all values from 0 to 1.99, orange for all values from 2 to 3.99, red for all values from 4 to 5.99 and green for all values from 6 to 7. More precisely, I am looking for something like this:

Now, I tried the following code:

library(reshape2)  # melt()
library(ggplot2)
library(scales)    # percent_format()

Data_A <- melt(cbind(Data, ind = rownames(Data)), id.vars = c('ind'))

ggplot(Data_A, aes(x = variable, y = value, fill = factor(value))) + 
  geom_bar(position = "fill", stat = "identity") + 
  scale_y_continuous(labels = percent_format()) + 
  coord_flip()

Unfortunately, I have no idea how to group the values in those categories I mentioned above. What is more, using this code the values are not even arranged in the right order, from low to high.

Could you please give me some recommendations how to get a picture as shown above?

There is one further problem: each of those 8 individuals belongs to one of two groups, and I would like to show the values separately for the two groups. However, adding this extra variable to my code would just melt it together with the other variables, so I don't see a way to account for the groups, for instance by using facet_grid() with the group identifier. Do you have a suggestion here as well? Should I maybe use an entirely different approach?

score 0 · Answer 1 · answered May 24 '18 at 20:19

You are OK up to the melt. Does this do what you are after?

ggplot(Data_A, aes(x = variable, y = value, fill = cut(value,breaks = c(0,2,4,6,7)))) + 
  geom_bar(position = "fill", stat = "identity") + 
  scale_y_continuous(labels = percent_format())  +
  scale_fill_manual(name="answer",values=c("yellow","orange","red","green")) +
  coord_flip()
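
One thing worth checking with this approach (a base-R sketch, not part of the original answer): with stat = "identity" and position = "fill", each segment is that bin's share of the summed values, not its share of respondents. For LMX the two shares differ noticeably:

```r
# LMX values from the question
lmx <- c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64)
bins <- cut(lmx, breaks = c(0, 2, 4, 6, 7))

# Share of respondents in each bin (what counting rows gives)
count_share <- as.numeric(prop.table(table(bins)))

# Share of the summed values in each bin (what stat = "identity" stacks)
value_share <- as.numeric(tapply(lmx, bins, sum) / sum(lmx))

print(round(cbind(count_share, value_share), 3))
```

If the bars should reflect how many respondents fall in each category, dropping y and stat = "identity" so that geom_bar() counts rows is probably the safer route.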

answered May 24 '18 at 20:19

Andrew Gustar


Thanks so much for the answer, I finally used gather instead of melt but I guess they are both equally useful here. I added an answer below on my own concerning distinguishing between groups to which the respondents belong and am grateful for any help here as well. – Andreas G. May 25 '18 at 19:07

score 0 · Answer 2 · answered May 24 '18 at 20:34

To group numeric values into fill categories, use the cut() function. It bins the numbers into the intervals defined by the breaks you supply (the breaks may extend to -Inf and Inf if needed). These groups can then be given specific colours with scale_fill_manual().
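
A small base-R sketch of how cut() treats the interval boundaries may help here. By default the intervals are closed on the right, so a value of exactly 2 lands in the first bin; right = FALSE closes them on the left instead. The test values below are made up for illustration.

```r
x <- c(1.99, 2, 4, 6)   # illustrative values, not taken from the question
breaks <- c(0, 2, 4, 6, 7)

# Default: intervals (0,2] (2,4] (4,6] (6,7]
right_closed <- cut(x, breaks)

# right = FALSE: intervals [0,2) [2,4) [4,6) [6,7)
left_closed <- cut(x, breaks, right = FALSE)

print(data.frame(x, right_closed, left_closed))
```

With right = FALSE, a value of exactly 7 would become NA unless include.lowest = TRUE is also set.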

Use this code:

ggplot(Data_A, aes(x = variable, y = value)) +
  scale_y_continuous(labels = percent_format())+coord_flip()+ 
  geom_bar(position = "fill", stat = "identity",aes(fill=cut(value,c(0,2,4,6,7))))+
  scale_fill_manual(values=c("#F8F668","#F8BA5B","#F66053","#82F653"))+
  labs(fill="")+theme(panel.background = element_blank())

The output of this plot is provided below:

Hope this helps!!

Thanks very much, very useful answer! I added code in an additional answer below. Do you have an idea how to have two graphs, one for each group to which the respondents belong? — Andreas G., May 25 '18 at 19:10

nael_kl · Accepted Answer · 2018-05-27T20:47:27.777

Is this what you're looking for regarding the first part? (I advise you change colors to prevent epileptic seizures.)

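library(dplyr)    # %>% and mutate_all()
library(tidyr)    # gather()
library(ggplot2)

Data %>%
  # right = FALSE gives [0,2), [2,4), [4,6); include.lowest = TRUE closes the last bin as [6,7]
  mutate_all(cut, c(0, 2, 4, 6, 7), right = FALSE, include.lowest = TRUE) %>% 
  gather(key = "variable", value = "value") %>% 
  ggplot(aes(x = variable, fill = value)) + 
  geom_bar(position = position_fill(reverse = TRUE)) +
  coord_flip() +
  scale_fill_manual(values = c("yellow", "orange", "red", "green"))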

For the second part, a reproducible example would be useful but you can probably add a "group" variable (between gather and ggplot) and use facet_grid or facet_wrap.

--- Edited below after information about groups ---

The comma that separates the row index from the column index is missing in DataG[Data_IlA$G1_ID == 2], and the names used there (Data_IlA, G1_ID) do not match the ones in DataG (which has Group_ID), so DataG_1 cannot be created.
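
For reference, a minimal base-R sketch of the row subset that was presumably intended, assuming the group column is Group_ID as in DataG. Indexing a data frame with a single index and no comma selects columns, so the condition has to go before a comma:

```r
DataG <- data.frame(
  LMX       = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64),
  Thriving  = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55),
  Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64),
  Group_ID  = c(1, 2, 1, 2, 2, 2, 1, 1)
)

# Rows where Group_ID is 2, all columns (note the comma after the condition)
DataG_2 <- DataG[DataG$Group_ID == 2, ]

# Same rows, measurement columns only
DataG_2_vars <- DataG_2[, c("LMX", "Thriving", "Wellbeing")]
print(DataG_2_vars)
```

With facet_grid(), as in the suggestions that follow, no manual subsetting is needed at all.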

Does one of the suggestions below make the figure you want?

DataG %>%
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(Group_ID ~ .)

DataG %>%
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = Group_ID, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_x_discrete(limits = c("Group 1","Group 2")) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(variable ~ .)

--- Edited below after comment on groups ---

If you need to change categories for any variable, the easiest way may be to do so before calling ggplot:

DataG %>%
  mutate(Group_ID = case_when(
    Group_ID == 1 ~ "1st group's name",
    Group_ID == 2 ~ "2nd group's name"
  )) %>% 
  gather(key = "variable", value = "value", -Group_ID) %>%
  mutate(value = cut(value, c(0, 1.99, 3.99, 5.99, 7))) %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank()) +
  xlab("") + ylab("") +
  facet_grid(Group_ID ~ .)
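
A base-R alternative, sketched under the assumption that the two groups should be labelled "Department A" and "Department B" as in the comments: converting Group_ID to a labelled factor before plotting should have the same effect as case_when(), and the order of the levels also determines the order of the panels.

```r
group_id <- c(1, 2, 1, 2, 2, 2, 1, 1)   # Group_ID column from DataG

# Map the numeric codes to descriptive labels; level order sets the facet order
group_named <- factor(group_id,
                      levels = c(1, 2),
                      labels = c("Department A", "Department B"))

print(table(group_named))
```

The resulting factor can replace Group_ID in the data frame (for example DataG$Group_ID <- group_named) before the call to ggplot().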

Thanks for your answer, very helpful! I added an answer of my own below with a reproducible example for the second part of the question. Maybe you have an idea here too? (And yes, changing the colours was indeed necessary ;) ) – Andreas G., May 25 '18 at 19:04
This is exactly what I was looking for, thanks!! One last thing: I want to give the groups specific names. For the second graphic I found out how to do it: "... scale_x_discrete(limits = c("Group 1","Group 2"), labels = c("Department A", "Department B")) ...". However, for the first graphic I failed. I tried to add it under labeller in facet_grid(), but it just does not work: "... facet_grid(Group_ID ~ ., labeller = labeller("1" = as_labeller("Department A"), "2" = as_labeller("Department B"))) ...". Do you have an idea or suggestion here as well? – Andreas G., May 27 '18 at 11:09
There are several ways to do so, but I advise you to change the values of your variables before calling `ggplot` (see edited answer). – nael_kl, May 27 '18 at 13:06
And again it works! Many thanks for your brilliant help - I wish I had the same level of knowledge ... – Andreas G., May 27 '18 at 15:49
Thanks, but my knowledge here is limited to a few tidyverse tricks you can easily learn. Glad I could help; you can accept this answer if your problem is solved. – nael_kl, May 27 '18 at 20:46

score -1 · Answer 4 · answered May 25 '18 at 19:02

Thanks to the very helpful answers, I was able to put together the following code to answer the first question I originally asked:

DataG <- data.frame(LMX = c(1.92, 2.33, 3.52, 5.34, 6.07, 4.23, 3.45, 5.64), Thriving = c(4.33, 6.54, 6.13, 4.85, 4.26, 6.32, 5.63, 4.55), Wellbeing = c(1.92, 2.33, 3.52, 2.34, 4.07, 3.23, 3.45, 4.64) , Group_ID = c(1, 2, 1, 2, 2, 2, 1, 1))
rownames(DataG) <- 1:8


DataG[Data_IlA$G1_ID == 2] %>%
  select("Leader-Member-Exchange" = LMX, "Thriving" = Thriving, "Wellbeing" = Wellbeing) %>% 
  na.omit -> DataG_1

DataG_1 %>%
  mutate_all(cut, c(0, 1.99, 3.99, 5.99, 7) ) %>%
  gather(key = "variable", value = "value") %>%
  ggplot(aes(x = variable, fill = value)) +
  geom_bar(position = position_fill(reverse = TRUE)) +
  scale_y_continuous(labels = percent_format()) +
  coord_flip() +
  scale_fill_manual(values=c("#19557E","#6E3B60", "#EA916A", "#EFC76C")) +
  theme(panel.background = element_blank())

Now, concerning the second question I originally raised: as you can see in the source data (DataG) above, I added another variable, G1_ID, which is a group identifier; every respondent belongs to one of two groups. I would like to show separate bar graphs for each group. As you can see in the code, I added "[Data_IlA$G1_ID == 2]" after the source data DataG so that R only considers the observations that belong to group 2. However, this addition does not change anything at all. Why is that? What other code could I use to distinguish between the two groups? Should I resort to facet_grid() instead?

Thank you so much for your comments,

Andreas
