1

I am trying to create a scatterplot using ggplot. Is there a way to stop my text labels from overlapping the trend line?

So far I have only been able to stop the text labels from overlapping each other.

library(ggplot2)
rownames = c("dummy", "dummy", "dummy", "dummy", "dummy", "dummy","dummy", "dummy", "dummy", "dummy")
corr_truth = c(-0.39, -0.13, 0.28, -0.49, -0.14, 0.52, 0.43, 0.22, -0.29, -0.02)
corr_pred= c(-0.41, 0.01, 0.36, -0.38, -0.28, 0.44, 0.26, 0.24, -0.38, -0.23)
corr_complete = data.frame(rownames, corr_truth,corr_pred)

plot_corr_complete = ggplot(data = corr_complete, aes(corr_truth, corr_pred)) + geom_point() + 
  xlim(-0.5,0.7) + 
  ylim(-0.5,0.7) +
  geom_text(label = corr_complete$rownames, nudge_x = 0.08, nudge_y = 0.005, check_overlap = TRUE) +
  geom_smooth(method = "lm", se = FALSE, color = "black")
plot_corr_complete
Tung
Sarah

2 Answers

3

An example using ggrepel. I needed to add some padding to the solution, so the labels did not overlap the trend line.

library(tidyverse)
library(ggrepel)
rownames = c("dummy", "dummy", "dummy", "dummy", "dummy", "dummy","dummy", "dummy", "dummy", "dummy")
corr_truth = c(-0.39, -0.13, 0.28, -0.49, -0.14, 0.52, 0.43, 0.22, -0.29, -0.02)
corr_pred= c(-0.41, 0.01, 0.36, -0.38, -0.28, 0.44, 0.26, 0.24, -0.38, -0.23)
corr_complete = data.frame(rownames, corr_truth,corr_pred)

plot_corr_complete = ggplot(data = corr_complete, aes(corr_truth, corr_pred)) + geom_point() + 
  xlim(-0.5,0.7) + 
  ylim(-0.5,0.7) +
  geom_text_repel(label = corr_complete$rownames,point.padding = 0.2,
                  nudge_y = 0.005, nudge_x = 0.02) +
  geom_smooth(method = "lm", se = FALSE, color = "black")
plot_corr_complete
Henry Cyranka
  • Thanks @Harro, this definitely worked for the dummy set. I have now added fullrange = TRUE to the geom_smooth part so that the trend line runs through the whole plot rather than only across the data, and now the label is on top of the line again. Any idea how to overcome this? Many thanks :) – Sarah Nov 06 '18 at 00:37
  • It is hard to say without the full dataset. However, I would try increasing the point.padding parameter. – Henry Cyranka Nov 06 '18 at 01:44
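
Following up on the comments above, one possible workaround for the fullrange = TRUE case is to give geom_text_repel() extra, unlabeled points that sit on the fitted line. This relies on ggrepel's documented behavior that rows whose label is an empty string are not drawn but still repel the other labels. The sketch below is only an illustration under stated assumptions, not the answerer's method: fit, line_pts, and label_data are hypothetical names, and sampling 50 points along the line is an arbitrary choice.

```r
library(ggplot2)
library(ggrepel)

corr_complete <- data.frame(
  rownames   = rep("dummy", 10),
  corr_truth = c(-0.39, -0.13, 0.28, -0.49, -0.14, 0.52, 0.43, 0.22, -0.29, -0.02),
  corr_pred  = c(-0.41, 0.01, 0.36, -0.38, -0.28, 0.44, 0.26, 0.24, -0.38, -0.23)
)

# Fit the same linear model that geom_smooth(method = "lm") draws
fit <- lm(corr_pred ~ corr_truth, data = corr_complete)

# Hypothetical helper data: points sampled along the fitted line
# across the full x range of the plot (assumed count: 50)
line_pts <- data.frame(corr_truth = seq(-0.5, 0.7, length.out = 50))
line_pts$corr_pred <- predict(fit, newdata = line_pts)
line_pts$rownames  <- ""  # empty labels are not drawn but still repel

# Drop line points that would fall outside the y limits
line_pts <- line_pts[line_pts$corr_pred >= -0.5 & line_pts$corr_pred <= 0.7, ]

# Real points plus the invisible "obstacle" points on the line
label_data <- rbind(corr_complete, line_pts)

plot_corr_complete <- ggplot(corr_complete, aes(corr_truth, corr_pred)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "black", fullrange = TRUE) +
  geom_text_repel(data = label_data, aes(label = rownames),
                  point.padding = 0.2) +
  xlim(-0.5, 0.7) +
  ylim(-0.5, 0.7)
plot_corr_complete
```

Sampling more points along the line or raising point.padding should push the labels further away from it. This has only been thought through for the dummy data, so the values may need tuning on the full dataset.
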
1

The ggrepel package provides functions that keep text labels from overlapping. Once you've installed the package, load it before running the following code. The revised code worked on my machine:

library(ggplot2)
library(ggrepel)
rownames = c("dummy", "dummy", "dummy", "dummy", "dummy", "dummy","dummy", "dummy", "dummy", "dummy")
corr_truth = c(-0.39, -0.13, 0.28, -0.49, -0.14, 0.52, 0.43, 0.22, -0.29, -0.02)
corr_pred= c(-0.41, 0.01, 0.36, -0.38, -0.28, 0.44, 0.26, 0.24, -0.38, -0.23)
corr_complete = data.frame(rownames, corr_truth,corr_pred)

plot_corr_complete = ggplot(data = corr_complete, aes(corr_truth, corr_pred, label = rownames)) + geom_point() + 
  xlim(-0.5,0.7) + 
  ylim(-0.5,0.7) +
  geom_text_repel() +
  geom_smooth(method = "lm", se = FALSE, color = "black")
plot_corr_complete

Hope this helps

Han Soul