I am plotting a set of Odds Ratios for 18 variables which fall within 3 groups.

I would like the variables to be ordered by group, as well as coloured by group. So far I have managed to colour the points by group - but cannot figure out how to stop the variables from plotting alphabetically.

My script so far:

ggplot(pfhist_df, aes(x = boxOdds, y = groups)) + 
  geom_errorbarh(aes(xmin = boxCILow, xmax = boxCIHigh), size = .5, height = .2, color = "gray50") +
  geom_point(aes(colour = groups, shape = groups), size = 3.5 ) +
  geom_vline(aes(xintercept = 1), linetype = 2) +
  scale_shape_manual(values = rep(15, 5)) +
  scale_fill_manual(values = rep(c("#ffffff00", "#f0f0f090"), 9)[-1],
                    guide = "none") +
  scale_x_log10() +
  #coord_cartesian(xlim = c(0.01, 100)) +
  facet_grid(labels~., switch = "y") +
  theme_bw() +
  theme(panel.spacing.y = unit(0, "points"),
        panel.border = element_blank(),
        axis.text.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        axis.ticks.length.y = unit(0, "points"),
        strip.text.y.left = element_text(angle = 0),
        strip.background.y = element_blank(),
        strip.placement = "outside",
        axis.line = element_line()
  ) + 
  ylab("") + 
  xlab("Odds Ratio") + 
  ggtitle("Recent Exposure to Plasmodium falciparum") 

Which produces:

[Image: plot with alphabetically ordered variables]

I have tried calling factor(groups), but this does not work: it simply gives me another legend.

My data:

pfhist_df <- data.frame(labels = c("Travel time to hospital > 10h",
                                   "Travel time to hospital > 18h",
                                   "Travel time to hospital > 28h",
                                   "Ethnicity: Dusun",
                                   "Ethnicity: Rungus",
                                   "Ethnicity: Sungoi",
                                   "Wealth: quartile 2", 
                                   "Wealth: quartile 3",
                                   "Wealth: quartile 4",
                                   "Occupation: student", 
                                   "Previously diagnosed with malaria", 
                                   "Elevation: 13 - 30m", 
                                   "Elevation > 213m", 
                                   "Travel time to clinic >7h", 
                                   "Travel time to clinic >10h", 
                                   "Head of household educated", 
                                   "Spend more than 3h outside daily", 
                                   "Travel to forest"), 
                        groups = c("Household", 
                                   "Household",
                                   "Household", 
                                   "Demographics",
                                   "Demographics",
                                   "Demographics",
                                   "Demographics",
                                   "Demographics",
                                   "Demographics",
                                   "Demographics",
                                   "Behaviour", 
                                   "Household",
                                   "Household",
                                   "Household",
                                   "Household",
                                   "Demographics",
                                   "Behaviour",
                                   "Behaviour"),
                        boxOdds = c(1.34,
                                    1.58,
                                    2.33,
                                    1.89,
                                    1.57,
                                    1.46,
                                    0.8, 
                                    0.69,
                                    0.48,
                                    0.43, 
                                    1.6, 
                                    1.26, 
                                    0.54, 
                                    1.22, 
                                    1.32,
                                    0.79, 
                                    1.57, 
                                    1.4),
                        boxCILow = c(1.09, 
                                     1.27, 
                                     1.83, 
                                     1.43, 
                                     1.21, 
                                     1.06, 
                                     0.67, 
                                     0.56, 0.38, 0.28, 1.36, 1.03, 0.4, 1.01, 1.08, 0.72, 1.11, 1.12),
                        boxCIHigh = c(1.64, 1.96, 2.95, 2.49, 2.05, 2.29, 0.99, 0.85, 0.6, 0.65, 
                                      1.88, 1.54, 0.72, 1.48, 1.63, 0.87, 2.23, 1.75))
                                   
pfhist_df$groups <- as.factor(pfhist_df$groups)                                 

bellbyrne
  • I get the following error trying to recreate your data: `Error in data.frame(labels = c("Travel time to hospital > 10h", "Travel time to hospital > 18h", : object 'box_pf_Labels' not found`. Check this line: `y = length(box_pf_Labels):1, ` – On_an_island Jul 22 '22 at 15:46
  • What order do you want your `groups` variable to be? You don't want them alphabetically as you mention but you don't say the order you want them to be in your plot. – On_an_island Jul 22 '22 at 16:03
  • Apologies @On_an_island I have edited the data and this should run now. The groups can be ordered as Demographics, Household, Behaviour. Thank you! – bellbyrne Jul 22 '22 at 16:13
  • If all you want is to change the order of the `groups` variable then you can do that by replacing this line: `pfhist_df$groups <- as.factor(pfhist_df$groups)` with `pfhist_df$groups <- factor(x = pfhist_df$groups, levels = c("Demographics", "Household", "Behaviour"))`. This will reorder the groups variable by your desired order. – On_an_island Jul 22 '22 at 16:23
  • Thanks @On_an_island - this has not changed the order they are plotted on the graph though. They are still alphabetical – bellbyrne Jul 22 '22 at 16:38

2 Answers

What you could do is remove the facet_grid, map labels to the y axis, and use fct_reorder to order the labels by their group, like this:

library(forcats)
library(ggplot2)

# The column is called groups (not group); levels follow the order of first appearance
pfhist_df$groups <- factor(x = pfhist_df$groups, levels = unique(pfhist_df$groups))

# as.integer() turns the factor into its level codes, so fct_reorder() can sort the labels by them
ggplot(pfhist_df, aes(x = boxOdds, y = fct_reorder(labels, as.integer(groups)))) + 
  geom_errorbarh(aes(xmin = boxCILow, xmax = boxCIHigh), size = .5, height = .2, color = "gray50") +
  geom_point(aes(colour = groups, shape = groups), size = 3.5 ) +
  geom_vline(aes(xintercept = 1), linetype = 2) +
  scale_shape_manual(values = rep(15, 5)) +
  scale_fill_manual(values = rep(c("#ffffff00", "#f0f0f090"), 9)[-1],
                    guide = "none") +
  scale_x_log10() +
  theme_bw() +
  ylab("") + 
  xlab("Odds Ratio") + 
  ggtitle("Recent Exposure to Plasmodium falciparum") 

Created on 2022-07-22 by the reprex package (v2.0.1)
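
To check the resulting order without drawing the plot, you can inspect the factor levels directly. This is only a minimal sketch: `toy` is a made-up three-row subset of the question's data (one row per group), and the group order is the one requested in the comments. Passing `as.integer(groups)` to fct_reorder is my choice here, so that the sort key is numeric.

```r
library(forcats)

# Hypothetical three-row subset of the question's data, one row per group
toy <- data.frame(
  labels = c("Travel to forest", "Ethnicity: Dusun", "Elevation > 213m"),
  groups = c("Behaviour", "Demographics", "Household")
)

# Group order requested in the comments
toy$groups <- factor(toy$groups, levels = c("Demographics", "Household", "Behaviour"))

# Without reordering, the labels get alphabetical levels
alphabetical <- levels(factor(toy$labels))

# Reorder the labels by the integer code of their group
by_group <- levels(fct_reorder(toy$labels, as.integer(toy$groups)))

print(alphabetical)
print(by_group)
```

The second vector should list the Demographics label first, then Household, then Behaviour. On a discrete y axis ggplot2 draws the first level at the bottom, so reverse the levels (for example with fct_rev) if you want the first group at the top.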

Quinten

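One option would be to first arrange your dataset by groups and boxOdds, then use forcats::fct_inorder to fix the factor levels of labels in that row order. Because the plot is facetted by labels, the facets then follow those levels instead of being sorted alphabetically: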

library(ggplot2)
library(dplyr)
library(forcats)

pfhist_df <- pfhist_df |> 
  arrange(groups, boxOdds) |> 
  mutate(labels = forcats::fct_inorder(labels))

ggplot(pfhist_df, aes(x = boxOdds, y = groups)) + 
  geom_errorbarh(aes(xmin = boxCILow, xmax = boxCIHigh), size = .5, height = .2, color = "gray50") +
  geom_point(aes(colour = groups, shape = groups), size = 3.5 ) +
  geom_vline(aes(xintercept = 1), linetype = 2) +
  scale_shape_manual(values = rep(15, 5)) +
  scale_fill_manual(values = rep(c("#ffffff00", "#f0f0f090"), 9)[-1],
                    guide = "none") +
  scale_x_log10() +
  facet_grid(labels~., switch = "y") +
  theme_bw() +
  theme(panel.spacing.y = unit(0, "points"),
        panel.border = element_blank(),
        axis.text.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        axis.ticks.length.y = unit(0, "points"),
        strip.text.y.left = element_text(angle = 0),
        strip.background.y = element_blank(),
        strip.placement = "outside",
        axis.line = element_line()
  ) + 
  ylab("") + 
  xlab("Odds Ratio") + 
  ggtitle("Recent Exposure to Plasmodium falciparum")
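
Note that arrange() sorts a factor by its level order, so after as.factor() the groups come out as Behaviour, Demographics, Household. If you want the order mentioned in the comments (Demographics, Household, Behaviour), one way is to set the levels of groups before arranging. A minimal sketch on a made-up four-row subset of the data (`toy` and `group_order` are names I introduce for illustration only):

```r
library(dplyr)
library(forcats)

# Hypothetical four-row subset of the question's data
toy <- data.frame(
  labels  = c("Travel to forest", "Ethnicity: Dusun",
              "Elevation > 213m", "Wealth: quartile 4"),
  groups  = c("Behaviour", "Demographics", "Household", "Demographics"),
  boxOdds = c(1.40, 1.89, 0.54, 0.48)
)

# Group order requested in the comments
group_order <- c("Demographics", "Household", "Behaviour")

toy <- toy |>
  mutate(groups = factor(groups, levels = group_order)) |>  # explicit group order
  arrange(groups, boxOdds) |>                               # sort rows by group, then odds ratio
  mutate(labels = fct_inorder(labels))                      # levels follow the sorted row order

print(levels(toy$labels))
```

The same three pipeline steps can be applied to the full pfhist_df before the ggplot() call above; with facet_grid(labels ~ .) the first level becomes the top facet.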


stefan