I have survey data in which respondents answer multiple questions.

In one variable, respondents express their opinion about a party: "1" = favorable and "0" = unfavorable.

In the second variable, they rate an issue as "1" = major threat, "2" = minor threat, or "3" = not a threat.

ID Country  PartyOP  ThreatPercp
1   France     1       2
2   France     1       3
3   France     0       1
4   France     1       1
5   France     0       2

My theory is that those with a favorable opinion of the party are more likely to see the issue as a threat.

I want the stacked bar chart to show the following:

X-axis: PartyOP (party opinion)
Y-axis: count, percentage, or frequency
Color: ThreatPercp

I tried the following, but it didn't work. The "fill" did not display anything:

ggplot(data = France) +
  geom_bar(aes(x = PartyOp)) + 
  labs(x = "Party Opinion",
       y = "Count")
+ geom_col(aes(fill = ThreatPercp), width = 0.7)

+ theme_bw()

Any idea how to nail the stacked bar chart?

1 Answer

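One way to achieve your desired result is to first aggregate your dataset, e.g. with dplyr::count, and then use geom_col to draw the bar chart. Note that fill has to be mapped to a factor (here factor(ThreatPercp)) to get one discrete color per threat level.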

Using some fake random example data (which includes some NAs in column ThreatPercp):

library(dplyr)
library(ggplot2)

set.seed(123)

France <- data.frame(
  ID = seq(1000),
  Country = "France",
  PartyOP = sample(c(0, 1), 1000, replace = TRUE),
  ThreatPercp = sample(c(1:3, NA), 1000, replace = TRUE)
)

# Drop rows with a missing threat rating, then count each PartyOP/ThreatPercp combination
France1 <- France %>%
  filter(!is.na(ThreatPercp)) %>%
  count(PartyOP, ThreatPercp, name = "count")

ggplot(data = France1) +
  geom_col(aes(x = factor(PartyOP), y = count, fill = factor(ThreatPercp))) +
  labs(
    x = "Party Opinion",
    y = "Count"
  ) +
  theme_bw()
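
Since the question also mentions percentages, here is a minimal sketch of that variant, assuming the same fake `France` data as above. `geom_bar()` counts the rows itself, so no pre-aggregation is needed, and `position = "fill"` rescales every bar to proportions. The names `France_complete` and `p_pct` and the axis and legend titles are my own choices, not something from the question.

```r
library(dplyr)
library(ggplot2)

set.seed(123)

# Same fake data as above (assumed columns: PartyOP, ThreatPercp)
France <- data.frame(
  ID = seq(1000),
  Country = "France",
  PartyOP = sample(c(0, 1), 1000, replace = TRUE),
  ThreatPercp = sample(c(1:3, NA), 1000, replace = TRUE)
)

# Keep only rows with a threat rating
France_complete <- France %>%
  filter(!is.na(ThreatPercp))

# geom_bar() counts rows per group; position = "fill" scales each bar to 1
p_pct <- ggplot(data = France_complete) +
  geom_bar(
    aes(x = factor(PartyOP), fill = factor(ThreatPercp)),
    position = "fill",
    width = 0.7
  ) +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(
    x = "Party Opinion",
    y = "Share of respondents",
    fill = "Threat perception"
  ) +
  theme_bw()

p_pct
```

With bars scaled to 100%, the two party-opinion groups can be compared directly even if they differ in size, which is probably closer to what the theory in the question needs.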

stefan
  • I just wanted to point out that the data I listed above was only an example; my data has about 999 rows, with NAs in both columns. Is there a way to adjust this code so that it runs correctly? – Nouran Samer Apr 22 '22 at 12:07
  • Hi Nouran. In general, the code works for datasets of any size. The question is how you want to deal with the NAs. I just made an edit and added a random dataset with 1000 rows that includes some NAs. If you don't want any NAs to show up, you can drop them using e.g. `filter` or `tidyr::drop_na()`. – stefan Apr 22 '22 at 12:42
  • Thank you so much, Stefan! I highly appreciate your help. – Nouran Samer Apr 22 '22 at 20:01
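
The two ways of handling NAs discussed in the comments can be sketched as follows. This is only an illustration on a hypothetical toy data frame (`survey`) with NAs in both columns; the "Missing" label and the object names are assumptions of mine.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data with NAs in both columns
survey <- data.frame(
  PartyOP = c(1, 1, 0, NA, 0, 1, NA, 0),
  ThreatPercp = c(2, 3, 1, 1, NA, NA, 2, 2)
)

# Option 1: drop every row with an NA in either column, then count
counts_dropped <- survey %>%
  drop_na(PartyOP, ThreatPercp) %>%
  count(PartyOP, ThreatPercp, name = "count")

# Option 2: keep the NAs and show them as an explicit "Missing" category
counts_kept <- survey %>%
  mutate(across(
    c(PartyOP, ThreatPercp),
    ~ ifelse(is.na(.x), "Missing", as.character(.x))
  )) %>%
  count(PartyOP, ThreatPercp, name = "count")

# Same plotting code as in the answer, here for the dropped-NA counts
ggplot(data = counts_dropped) +
  geom_col(aes(x = factor(PartyOP), y = count, fill = factor(ThreatPercp))) +
  labs(x = "Party Opinion", y = "Count", fill = "Threat perception") +
  theme_bw()
```

Passing `counts_kept` to the same `ggplot()` call would add a "Missing" bar and a "Missing" fill level instead of silently discarding those respondents.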