
Hi R and ggplot experts, I recently started learning R and experimenting with ggplot.

I have a use case, explained below.

Reproducible R code:

library(ggplot2)
library(ggrepel)

# Create the data frame.
sales_data <- data.frame(
  emp_name = rep(c("Sam", "Dave", "John", "Harry", "Clark", 
        "Kent", "Kenneth", "Richard", "Clement", "Toby", 
        "Jonathan"), times = 3), 
  month = as.factor(rep(c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar", 
          "Jan", "Feb", "Mar", "Jan", "Jan"), times = 3)),
  dept_name = as.factor(rep(c("Production", "Services", "Support", 
        "Support", "Services", "Production", "Production", 
        "Support", "Support", "Support", "Production"), 
      times = 3)), 
  revenue = rep(c(100, 200, 300, 400, 500, 600, 500, 400, 300, 
              200, 500), times = 3),
  status = rep(c("Low", "Medium", "Medium", "High", "Very High", 
           "Very High", "Very High", "High", "Medium", "Medium", 
           "Low"), times = 3)
)    
sales_data$month <- factor(sales_data$month, levels = c("Jan", 
             "Feb", "Mar"))
month_vector <- levels(sales_data$month)
sales_data$month <- as.integer(sales_data$month)

sales_data$status <- factor(sales_data$status, levels = c("Low", 
                            "Medium", "High", "Very High"))

dept_vector <- levels(sales_data$dept_name)
sales_data$dept_name <- as.integer(sales_data$dept_name)

ggplot(sales_data, aes(x = month, y = dept_name)) +
  geom_raster(data = expand.grid(sales_data$month, 
        sales_data$dept_name), aes(x = Var1, y = Var2, width=1, 
          height=1), fill = NA, col = 'gray50', lty = 1) + 
  geom_point(aes(size = status), shape = 16,
    position = position_jitter(seed = 0), show.legend = FALSE) +
  scale_color_manual(name = "revenue") +
  geom_text(aes(label = revenue), size = 4, vjust = 1.6,
    position = position_jitter(seed = 0)) +
  theme_bw() +
  theme(
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.background = element_blank(),
    axis.line = element_blank(),
    panel.border = element_blank(),
    panel.grid = element_blank(),
    axis.text = element_text(colour = "blue", face = "plain", size = 11)
  ) +
  scale_x_continuous(limits = c(0.5, 3.5), expand = c(0, 0),
    breaks = 1:length(month_vector), labels = month_vector) +
  scale_y_continuous(limits = c(0.5, 3.5), expand = c(0, 0),
    breaks = 1:length(dept_vector), labels = dept_vector) +
  geom_hline(yintercept = as.numeric(sales_data$dept_name) + 0.5) +
  geom_vline(xintercept = as.numeric(sales_data$month) - 0.5,
    color = "grey")

Output plot: [image not available]

As the plot shows, the points drawn by geom_point often overlap. I have a solution in mind to fix the overlap, but I am not sure how to implement it in R and would appreciate guidance.

Step 1) Add a new column to the dataset (sales_data) that holds the number of points in each category combination. For example, the combination Feb and Services has 6 entries/points, so for every row belonging to that combination the new column's value should be 6.

Step 2) Take the square root of the number of entries in each category combination and round it up. For example, Feb and Services has 6 points, so ceiling(sqrt(6)) = 3. That tells me to split the x and y range of that category's tile into a 3 x 3 grid and plot the 6 points on the first 6 of those 9 grid positions inside the tile.
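To make the two steps concrete, here is a minimal sketch in base R on a small toy data frame with integer-coded categories, as in sales_data after the as.integer() conversions. The column names `n`, `side`, `idx`, `col`, `row`, `x_pos`, `y_pos` and the helper `cell_offset()` are hypothetical names chosen for illustration, and filling the grid row by row is just one possible ordering.

```r
# Toy data: integer-coded month (x) and dept_name (y), one row per point.
df <- data.frame(
  month     = c(2, 2, 2, 2, 2, 2, 1, 1, 3),
  dept_name = c(2, 2, 2, 2, 2, 2, 1, 1, 3)
)

# Step 1: number of points in each month/dept_name combination.
df$n <- ave(seq_len(nrow(df)), df$month, df$dept_name, FUN = length)

# Step 2: side length of the square grid that can hold n points.
df$side <- ceiling(sqrt(df$n))

# Running index (1..n) of each row inside its own tile.
df$idx <- ave(seq_len(nrow(df)), df$month, df$dept_name, FUN = seq_along)

# Hypothetical helper: centre of cell k (1-based) when a tile of width 1
# is split into `side` equal cells, as an offset from the tile centre.
cell_offset <- function(k, side) (k - 0.5) / side - 0.5

# Fill the grid row by row: column and row of each point within its tile.
df$col <- (df$idx - 1) %% df$side + 1
df$row <- (df$idx - 1) %/% df$side + 1

# Final plotting coordinates: tile centre plus the cell offset.
df$x_pos <- df$month + cell_offset(df$col, df$side)
df$y_pos <- df$dept_name + cell_offset(df$row, df$side)
```

Applied to sales_data, the same columns could replace the jitter: pass `aes(x = x_pos, y = y_pos, size = status)` to geom_point and `aes(x = x_pos, y = y_pos, label = revenue)` to geom_text, and drop `position_jitter()`. Large point sizes may still touch when a tile holds many points, so the size scale would probably need tuning.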

Can someone please guide me on how this can be done? I am sure it is possible, but I am not sure how to approach it.

kjetil b halvorsen
Om Sao
  • You may want to take a look at [ggrepel package](https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html) – ozacha Apr 03 '19 at 10:36
  • @ozacha: I have used `ggrepel` too, but with jittering the geom_point circles still overlap sometimes. This is sample data; my real data has many more points and far too much overlap – Om Sao Apr 03 '19 at 11:06
  • Maybe the answer to this question I asked a while back helps, even though my problem was about the labels and not the points: https://stackoverflow.com/questions/33258075/labelling-points-with-ggplot2-and-directlabels – erc Apr 03 '19 at 11:28
  • I think `ggpointgrid::geom_pointrect()` may come pretty close to what you're describing, @Om Sao: https://github.com/nevrome/ggpointgrid – nevrome Feb 22 '21 at 17:53
