I created several plots in R. Occasionally, the colors of the points in the plot do not match the colors shown for the same variables in the legend. (Unfortunately, I can't attach images yet because of my reputation.) The first two graphs get a black/red color scheme, but the third automatically uses green/black while its legend still shows black/red. I cannot understand why this happens.

How can I prevent this from happening? I know it's possible to assign color, but I am struggling to find a clear way to do this.

Code:

plot(rank, abundance, pch=16, col=type, cex=0.8)
legend(60,50,unique(type),col=1:length(type),pch=16)

plot(rank, abundance, pch=16, col=Origin, cex=0.8)
legend(60,50,unique(Origin),col=1:length(Origin),pch=16)


Below is where the color pattern doesn't match:

plot(rank, abundance, pch=16, col=Lifecycle, cex=0.8)
legend(60,50,unique(Lifecycle),col=1:length(Lifecycle),pch=16)

The data frame looks like this:

Plant   rank   abundance   Lifecycle   Origin   type
X       1      23          Perennial   Native   Weedy
Y       2      10          Annual      Exotic   Ornamental
Z       3      9           Perennial   Native   Ornamental
Yehuda Shapira
David L
  • Perhaps there are more than 2 factor levels in `Lifecycle` or an `NA` or two mixed in? – Tad Dallas Aug 02 '15 at 15:39
  • `palette()` will tell you the order of colors being used for your factors, or look at `plot(1:10, col=1:10)`. Try changing to `col=unique(Lifecycle)` in your legend. – Rorschach Aug 02 '15 at 15:41
  • Good point. And my comment isn't correct, since if there was an `NA` mixed in, it would be plotted in the legend. @nongkrong 's suggestion (isolating unique values) should work. If you wanted to be absolutely sure, you could use `sort(unique(Lifecycle))` in both plot and legend commands, which would remove `NA`s and sort the factor levels. – Tad Dallas Aug 02 '15 at 15:50
  • @Tad Dallas Do you mind posting that as an answer? – N8TRO Aug 02 '15 at 15:51
  • I will post a more coherent/less wrong version of what I was describing in my comment. haha. Feel free to edit it. – Tad Dallas Aug 02 '15 at 16:05

1 Answer

First, I create some fake data.

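# stringsAsFactors = TRUE keeps Lifecycle a factor (the default is FALSE since R 4.0)
df <- data.frame(rank = 1:10, abundance = runif(10, 10, 100),
                 Lifecycle = sample(c('Perennial', 'Annual'), 10, replace = TRUE),
                 stringsAsFactors = TRUE)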

Then I explicitly say what colors I want my points to be.

cols <- c('dodgerblue', 'plum')

Then I plot, using the factor df$Lifecycle to color points.

plot(df$rank, df$abundance, col = cols[df$Lifecycle], pch=16)
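
To see what indexing a color vector with a factor does, here is a minimal sketch. The vector `lifecycle` is made up for illustration and is not the asker's data.

```r
# Hypothetical example vector, not the original data
lifecycle <- factor(c("Perennial", "Annual", "Perennial"))
cols <- c("dodgerblue", "plum")

levels(lifecycle)      # level order, alphabetical by default
as.integer(lifecycle)  # integer codes: position of each value in levels()

# Indexing with a factor uses those integer codes,
# i.e. this is the same as cols[as.integer(lifecycle)]
point_cols <- cols[lifecycle]
point_cols             # "plum" "dodgerblue" "plum"
```

So 'Annual' (level 1) always gets the first color and 'Perennial' (level 2) the second, regardless of which value appears first in the data.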

When the factor df$Lifecycle is used as an index above, R uses the factor's integer codes to pick from cols, and by default the factor levels are ordered alphabetically ('Annual' is 1, 'Perennial' is 2). Therefore, in the legend, we just need to list the unique df$Lifecycle values in that same sorted order and hand it our color vector (cols).

legend(5, 40, sort(unique(df$Lifecycle)), col=cols, pch=16, bty='n')
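
As for why the legend in the question can disagree with the plot, the sketch below is one plausible reading, assuming the plotted columns are factors. The vector `lifecycle` is hypothetical. `unique()` returns values in order of first appearance, while the plot colors follow the level order, so the two only line up by luck.

```r
# Hypothetical data: the first observed value is not the first level
lifecycle <- factor(c("Perennial", "Annual", "Perennial"))

# With col = lifecycle, the points are colored by the integer codes
plot_codes <- as.integer(lifecycle)               # 2 1 2

# The question's legend pairs these labels with colors 1, 2, ...
legend_labels <- as.character(unique(lifecycle))  # order of first appearance
legend_cols   <- 1:length(lifecycle)              # one entry per row, not per level

# Labels taken in level order line up with codes 1, 2, ...
safe_labels <- levels(lifecycle)
safe_cols   <- seq_along(safe_labels)
```

Here the legend would label 'Perennial' with color 1 even though those points are drawn in color 2. Passing `levels(lifecycle)` and `seq_along(levels(lifecycle))` to `legend()` is equivalent to the `sort(unique(...))` approach above and keeps the two in step.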

Hopefully this helps.

Tad Dallas