I have XY data (a 2D t-SNE embedding of high-dimensional data) which I'd like to show as a scatter plot. The points are assigned to several clusters, so I'd like to color code the points by cluster and then add a single label for each cluster that has the same color as the cluster's points and is located, as far as possible, outside the cluster's points.
Any idea how to do this using R in either ggplot2 and ggrepel or plotly?
Here's the example data (the XY coordinates and cluster assignments are in df and the labels in label.df) and the ggplot2 part of it:
library(dplyr)
library(ggplot2)
set.seed(1)
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i) {
  data.frame(x = rnorm(50, mean = i, sd = 1), y = rnorm(50, mean = i, sd = 1), cluster = i)
}))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster = levels(df$cluster), label = paste0("cluster: ", levels(df$cluster)))
ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  theme_minimal() +
  theme(legend.position = "none")
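For reference, here is a rough sketch of one direction the ggplot2 and ggrepel route could take. It is not a confirmed solution. It assumes each label can be anchored at its cluster centroid, and it relies on ggrepel's behavior of repelling labels away from points whose label is the empty string. The names centroid.df and repel.df are made up for this illustration, and max.overlaps = Inf assumes ggrepel 0.9.0 or later.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

# Same example data as in the question
set.seed(1)
df <- do.call(rbind, lapply(seq(1, 20, 4), function(i) {
  data.frame(x = rnorm(50, mean = i, sd = 1), y = rnorm(50, mean = i, sd = 1), cluster = i)
}))
df$cluster <- factor(df$cluster)
label.df <- data.frame(cluster = levels(df$cluster), label = paste0("cluster: ", levels(df$cluster)))

# Assumption: one label anchor per cluster, placed at the cluster centroid
centroid.df <- df %>%
  group_by(cluster) %>%
  summarise(x = mean(x), y = mean(y)) %>%
  left_join(
    mutate(label.df,
           cluster = factor(cluster, levels = levels(df$cluster)),
           label = as.character(label)),
    by = "cluster"
  )

# All points get an empty label so they still push the real labels away;
# only the five centroid rows carry visible text
repel.df <- bind_rows(mutate(df, label = ""), centroid.df)

p <- ggplot(df, aes(x = x, y = y, color = cluster)) +
  geom_point() +
  geom_label_repel(data = repel.df, aes(label = label),
                   box.padding = 1, min.segment.length = 0,
                   max.overlaps = Inf, show.legend = FALSE) +
  theme_minimal() +
  theme(legend.position = "none")
print(p)
```

Because the label layer inherits the color aesthetic from the main ggplot() call, each label should take the same color as its cluster. How far the labels end up from the points depends on the repulsion settings (box.padding, force and so on), so those would likely need tuning on real t-SNE data.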