
I want to produce a faceted scatterplot with ggplot in which each panel shows the entire dataset in one colour, with a single ID (from that same dataset) drawn in a different colour on top of the whole scatter. This is the data:

**trajectories**

X    Y    ID
2    4     1
1    6     1
2    4     1
1    8     2
3    7     2
1    5     2
1    4     3
1    6     3
7    4     3

I use the following code to produce scatterplots for each ID:

ggplot(trajectories, aes(x=X, y=Y)) + 
geom_point() + 
facet_wrap( ~ ID)

How can I print each of these scatterplots on a scatterplot of the whole dataset?

Joeri
  • You mean you need one plot? Then why are you facetting? Remove `facet_wrap(.)` and use `geom_point(aes(colour=ID))`. – Arun Feb 25 '13 at 16:05
  • No, I do need the facetting because I want a separate plot for each ID. However, in each of those plots I also want the entire dataset plotted (for example in black), so that I can instantly see how each individual ID (for example in blue) relates to the entire dataset. – Joeri Feb 25 '13 at 16:17
  • You mean that each facet should show the whole dataset, but within each facet the points for that ID should be coloured differently? – Arun Feb 25 '13 at 16:22
  • Indeed, so that I get a scatterplot for each individual ID, combined with the entire dataset. In this case it would give me three scatterplots, each with the whole dataset in the background and the points of a single ID drawn on top. – Joeri Feb 25 '13 at 16:28

3 Answers


The only way I can think of is to replicate the data set three times, with a colour indicator that marks one ID per copy and a separate grouping variable for facetting. Assuming your data.frame is df:

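library(ggplot2)
library(reshape2)

# Replace ID with one indicator column per facet: 1 = highlighted ID, 2 = the rest
df$ID  <- NULL
df$ID1 <- rep(1:2, c(3,6))
df$ID2 <- c(2,2,2,1,1,1,2,2,2)
df$ID3 <- rep(2:1, c(6,3))
# Stack the three indicator columns, which yields three copies of the data
df.m <- melt(df, id.vars=c("X", "Y"))
df.m$grp <- gl(3, 9)   # facet index, one level per copy
df.m$value <- factor(df.m$value)

ggplot(data = df.m, aes(x = X, y = Y)) + geom_point(aes(colour = value)) + 
       facet_wrap(~ grp) + scale_colour_manual(values = c("blue", "black"))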


Note that some points share the same coordinates across different IDs, so a point drawn later hides one drawn earlier. For example, (1,6) should be blue in the first facet, but ID=3 also contains a point at (1,6), which is drawn on top of it in black.
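An alternative that avoids replicating the data by hand can be sketched with two point layers. This is a minimal sketch, not part of the original answer: it assumes the question's data frame is called `trajectories`, and the helper name `background` is made up for illustration. It relies on ggplot2 drawing a layer in every panel when that layer's data lacks the facetting variable. Because the blue layer is added last, the highlighted ID is drawn on top, which should also sidestep the overwriting issue described above.

```r
library(ggplot2)

# Data copied from the question
trajectories <- data.frame(
  X  = c(2, 1, 2, 1, 3, 1, 1, 1, 7),
  Y  = c(4, 6, 4, 8, 7, 5, 4, 6, 4),
  ID = rep(1:3, each = 3)
)

# Background copy without ID (hypothetical helper name): a layer whose
# data lacks the facetting variable is repeated in every panel
background <- trajectories[, c("X", "Y")]

p <- ggplot(trajectories, aes(x = X, y = Y)) +
  geom_point(data = background, colour = "black") +  # whole dataset
  geom_point(colour = "blue") +                      # this panel's ID, on top
  facet_wrap(~ ID)
print(p)
```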

Arun
  • The only other option I was thinking of was to make three separate graphs and then use `grid.arrange`. – joran Feb 25 '13 at 16:56
  • @joran, yes, of course. If `facetting` isn't an absolute necessity, `grid.arrange` comes in handy! – Arun Feb 25 '13 at 16:58
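The `grid.arrange` route mentioned in these comments might look roughly like the following. It is a sketch under assumptions: the data frame is taken to be the question's `trajectories`, the gridExtra package is installed, and the list name `plots` is chosen here for illustration.

```r
library(ggplot2)
library(gridExtra)

# Data copied from the question
trajectories <- data.frame(
  X  = c(2, 1, 2, 1, 3, 1, 1, 1, 7),
  Y  = c(4, 6, 4, 8, 7, 5, 4, 6, 4),
  ID = rep(1:3, each = 3)
)

# One standalone plot per ID: whole dataset in black, that ID in blue on top
plots <- lapply(unique(trajectories$ID), function(i) {
  ggplot(trajectories, aes(x = X, y = Y)) +
    geom_point(colour = "black") +
    geom_point(data = trajectories[trajectories$ID == i, ], colour = "blue") +
    ggtitle(paste("ID", i))
})

# Arrange the separate plots side by side
grid.arrange(grobs = plots, ncol = 3)
```

Unlike facets, these plots do not share axes automatically, so the axis limits may need to be fixed by hand if the panels have to be comparable.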

This should work:

ggplot(trajectories, aes(x=X, y=Y)) + 
  geom_point(aes(colour = factor(ID)))

This will create a scatterplot with a different colour for each ID. If you want a scatterplot with just one colour, leave out the aes(colour = factor(ID)) part.

To shade areas of a certain id, you can draw some inspiration from here:

How can I overlay two dense scatter plots so that I can see the outlines of each in R or Matlab?

It basically calculates a convex hull around each subgroup and draws it as a polygon.
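The convex hull idea could be sketched as follows. This is not the code from the linked question; it is a rough illustration that assumes the question's `trajectories` data frame and uses base R's `chull()` per ID. The name `hulls` is made up here.

```r
library(ggplot2)

# Data copied from the question
trajectories <- data.frame(
  X  = c(2, 1, 2, 1, 3, 1, 1, 1, 7),
  Y  = c(4, 6, 4, 8, 7, 5, 4, 6, 4),
  ID = rep(1:3, each = 3)
)

# For each ID, keep only the rows that lie on the convex hull
hulls <- do.call(rbind, lapply(split(trajectories, trajectories$ID), function(d) {
  d[chull(d$X, d$Y), ]
}))

# Shade each hull as a semi-transparent polygon and draw all points on top
p <- ggplot(trajectories, aes(x = X, y = Y)) +
  geom_polygon(data = hulls, aes(fill = factor(ID)), alpha = 0.3) +
  geom_point()
print(p)
```

With only three points per ID, as in the sample data, the hulls are small triangles or degenerate shapes; the approach is more informative on denser data.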

Paul Hiemstra
  • This is not what I meant. What I want is to give the entire dataset one colour in a scatterplot (black), with an overlay of the scatterplot of a single ID (blue). And I want this combined scatterplot of the entire dataset and a single ID repeated for every ID. – Joeri Feb 25 '13 at 16:25

In base plots you can do something like:

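  par(mfrow=c(length(unique(trajectories$ID)),1))
  # one panel per ID: points of that ID in red (col 2), all others in black (col 1)
  for(i in unique(trajectories$ID)){ with(trajectories, plot(X,Y,col=as.numeric(ID==i)+1)) }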


If overplotting is a problem, you can add jitter() or use transparent colours.
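As a rough illustration of both suggestions in one panel, assuming the question's `trajectories` data frame (the names `highlight` and `cols` are chosen here, and the blue/black colours follow the question rather than the loop above):

```r
# Data copied from the question
trajectories <- data.frame(
  X  = c(2, 1, 2, 1, 3, 1, 1, 1, 7),
  Y  = c(4, 6, 4, 8, 7, 5, 4, 6, 4),
  ID = rep(1:3, each = 3)
)

# Highlight ID 1 in blue, everything else in black, both half transparent
highlight <- trajectories$ID == 1
cols <- adjustcolor(ifelse(highlight, "blue", "black"), alpha.f = 0.5)

# jitter() adds a small random offset so coincident points become visible
plot(jitter(trajectories$X), jitter(trajectories$Y),
     col = cols, pch = 16, xlab = "X", ylab = "Y")
```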

Seth