Hi ggplot and R experts,

I'm a newbie. I have a use case where I am using geom_raster for better performance.

Here is the reproducible R script:

library(ggplot2)
library(ggrepel)
# Create the data frame.
sales_data <- data.frame(
  emp_name = rep(c("Sam", "Dave", "John", "Harry", "Clark", "Kent", "Kenneth", "Richard", "Clement", "Toby", "Jonathan"), times = 50), 
  month = as.factor(rep(c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar", "Jan", "Feb", "Mar", "Jan", "Jan"), times = 50)),
  dept_name = as.factor(rep(c("Production", "Services", "Support", "Support", "Services", "Production", "Production", "Support", "Support", "Support", "Production"), times = 50)), 
  revenue = rep(c(100, 200, 300, 400, 500, 600, 500, 400, 300, 200, 500), times = 50),
  status = rep(c("Low", "Medium", "Medium", "High", "Very High", "Very High", "Very High", "High", "Medium", "Medium", "Low"), times = 50)
)

sales_data$month <- factor(sales_data$month, levels = c("Jan", "Feb", "Mar"))
month_vector <- levels(sales_data$month)
number_of_entries <- nrow(sales_data)

sales_data$status <- factor(sales_data$status, levels = c("Low", "Medium", "High", "Very High"))
sales_data$month <- as.integer(sales_data$month)

ggplot(sales_data, aes(x = month, y = dept_name)) +
  geom_raster(data = expand.grid(sales_data$month, sales_data$dept_name),
              aes(x = Var1, y = Var2, width=1, height=1), fill = NA, col = 'gray50', lty = 1) + #default width and height is 1
  #SAFE: geom_point(aes(size = revenue, col = revenue), 
  #           shape = 16, position = position_jitter(seed = 0), show.legend = F) +
  geom_point(aes(size = status, colour = cut(revenue, c(-Inf, 199, 301, Inf)) ), 
             shape = 16, position = position_jitter(seed = 0), show.legend = F) +
  scale_color_manual(name = "revenue", 
                     values = c("(-Inf,199]" = "red",
                                "(199,301]" = "#ffbf00", #amber
                                "(301, Inf]" = "green") ) +
  geom_text(aes(label = revenue), size=4, vjust = 1.6, position = position_jitter(seed = 0)) + #try with geom_text

  theme_bw() +
  theme(
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.background = element_blank(), 
    axis.line = element_blank(), 
    panel.border = element_blank(), 
    panel.grid = element_blank(),

    axis.text = element_text(colour = "blue", face = "plain", size =11)
  ) +

  scale_x_continuous(limits=c(0.5,3.5), expand = c(0,0), breaks = 1:length(month_vector), labels = month_vector) +

  # Remove extra whitespace from y-axis so lines are against the axis
  scale_y_discrete(expand = c(0,0)) +
  # Add straight lines at each factor level, shifted left/down so they're between values
  geom_hline(yintercept = as.numeric(sales_data$dept_name) + 0.5) +
  geom_vline(xintercept = as.numeric(sales_data$month) - 0.5, color = "grey")

Output plot: [image]

As you can see, there is a lot of overlap. I have two questions:

  1. How can we increase the height of each row so that there is more space for geom_point? Can we use facet_grid in this case? I am not sure whether or how to use facet_grid here.

  2. Is there any way other than position_jitter to place the bubbles randomly so that they don't overlap?

Please help! I am sure this question will help many beginners in the future, as it is not addressed anywhere on SO or other platforms.

Om Sao
  • Not only is there a lot of overlap, but in each of the six panels, one or at most two numbers occur again and again. For this sample data at least, this might not be the most informative type of plot to begin with. And in any case, you might consider using the color to indicate the revenue (e.g., lighter color -- less revenue). – dipetkov Mar 23 '19 at 18:56
  • Hi dipetkov, this is just a demo for SO. Let's say that in the real case I have fields that are quite different. Is there any way out then? – Om Sao Mar 23 '19 at 20:50
  • Wouldn't it be difficult to read that many labels? Perhaps you can consider displaying the labels for just some points? The "most important" points according to some definition of importance. – dipetkov Mar 23 '19 at 21:14
  • If the size of the plot is right there will be less overlap, and with geom_text_repel and jitter the overlap will be minimal. What I want to know is: can we increase the height of the row? – Om Sao Mar 23 '19 at 23:57
  • You can specify the size of the plot when you save it with `ggsave`. See the `width` and `height` arguments. – dipetkov Mar 24 '19 at 00:01
  • So is there any way I can iteratively place points inside a section to minimize overlap? Jittering just places them randomly and leaves very little space between points for text labels. – Om Sao Mar 24 '19 at 00:37
  • Not sure. I would say probably not with the number of points you have in your example -- it is hard to read that many points, overlap or not. – dipetkov Mar 24 '19 at 00:48
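
The `ggsave` suggestion from the comments could look roughly like the sketch below. It is only an illustration, not a tested fix for the plot in the question: the `toy` data frame, the output file name, and the 8 x 12 inch size are assumptions chosen for the example. The idea is that a taller output file gives each department row more vertical room for the jittered points.

```r
library(ggplot2)

# Hypothetical toy data standing in for sales_data from the question.
toy <- data.frame(
  month     = rep(1:3, each = 20),
  dept_name = rep(c("Production", "Services", "Support", "Support"), times = 15),
  revenue   = rep(c(100, 200, 300, 400, 500), times = 12)
)

# Build the plot object first so it can be passed to ggsave() explicitly.
p <- ggplot(toy, aes(x = month, y = dept_name)) +
  geom_point(aes(size = revenue),
             position = position_jitter(seed = 0),
             show.legend = FALSE) +
  theme_bw()

# Assumed output location and size; a larger `height` stretches every row.
out_file <- file.path(tempdir(), "sales_plot.png")
ggsave(out_file, plot = p, width = 8, height = 12, units = "in", dpi = 150)
```

Note that this changes only the size of the saved file. Whether the labels become readable with the real data still depends on how many points fall into each cell.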

0 Answers