
I want to incrementally build a plot that contains several data series of different lengths. My goal is to control the appearance of each data series, give each one a custom name, and show an appropriate legend. Because the series are of different lengths, I cannot put them in a single data frame. In the code below I expect 4 lines: the shortest will be red, and the next ones will be blue, green, and black, respectively.

    library(ggplot2)
    set.seed(12345)
    plt <- ggplot()
    colors <- c('red', 'blue', 'green', 'black')
    for(i in seq(length(colors))) { 
      x <- seq(1, 2*i)
      y <- x * i + rnorm(length(x))
      df <- data.frame(x=x, y=y)
      plt <- plt +  geom_point(aes(x, y), data=df, color=colors[i]) + 
        geom_line(aes(x, y), data=df, color=colors[i]) 
    }
    print(plt)

This is what I get (image: my plot).

How can I give names to the lines and display a legend? Is there a better way to achieve my goal?

David D

2 Answers


The way to do this is to create a single data frame in long format, like this:

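    library(ggplot2)
    set.seed(12345)
    colors <- c('red', 'blue', 'green', 'black')
    # Build one data frame per series, tagged with its name in `series`
    dat <- lapply(seq_along(colors), function(i) {
      x <- seq(1, 2*i)
      data.frame(
        series = colors[i],
        x = x,
        y = x * i + rnorm(length(x))
      )
    })
    # Stack the per-series data frames into one long data frame
    dat <- do.call(rbind, dat)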

Now plot:

    ggplot(dat, aes(x, y, color=series)) + geom_line()


Andrie
  • Nice, but there is a problem: series names (colors) do not match line colors. – David D Feb 04 '13 at 15:33
  • Yes, but does this really matter? I mean, your real labels will probably refer to real things, not the colour of the line? In the event it does matter, you can control this with `scale_colour_manual()` or any of the other colour scales. – Andrie Feb 04 '13 at 15:37
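
For the colour mismatch raised in the comments, here is a minimal sketch of the `scale_colour_manual()` route, reusing the `dat` construction from the answer above. It assumes the series names are themselves valid R colour names, which happens to be true in this example. The name `series_colours` is introduced here purely for illustration.

```r
library(ggplot2)
set.seed(12345)

# Series names double as colour names here, as in the answer above.
colors <- c('red', 'blue', 'green', 'black')
dat <- do.call(rbind, lapply(seq_along(colors), function(i) {
  x <- seq(1, 2 * i)
  data.frame(series = colors[i], x = x, y = x * i + rnorm(length(x)))
}))

# Named vector: names are the values found in `series`,
# elements are the colours to draw them with (illustrative name).
series_colours <- setNames(colors, colors)

plt <- ggplot(dat, aes(x, y, colour = series)) +
  geom_line() +
  geom_point() +
  # `breaks` fixes the legend order; `values` is matched to series by name.
  scale_colour_manual(values = series_colours, breaks = colors)
print(plt)
```

With real labels, the same pattern should apply: keep the labels as the names of the vector and put the desired colours in its values.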

You don't have to use a for-loop and add to the plot after constructing each data set just because the series are of unequal lengths. This is why ggplot2 is awesome! You can create a group for each data set. The group values appear in the legend, so you can name each line whatever you want through the grouping variable (of course, you can also change the names directly in the legend later, if you wish). Here's what I think you expect:

    set.seed(12345)
    library(ggplot2)
    library(plyr)
    # Group labels; change the letters to whatever you want to appear in the legend
    line_names <- letters[1:4]
    # Use plyr to create x and y for each i and add the group
    dat <- ldply(seq_along(line_names), function(i) {
        x <- seq(1, 2*i)
        y <- x * i + rnorm(length(x))
        data.frame(x=x, y=y, grp=line_names[i])
    })

    # Just plot here
    ggplot(data = dat, aes(x=x, y=y)) + geom_line(aes(colour=grp)) + geom_point()

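
As for changing the names in the legend later, one possible sketch is to leave `grp` untouched and relabel through the colour scale. The labels ("Series 1" and so on) and the legend title are placeholders invented for illustration. Here `colour` is mapped in the top-level `aes()` so the points are coloured as well, which differs slightly from the plot above, and base R stands in for plyr to keep the snippet dependency-light.

```r
library(ggplot2)
set.seed(12345)

line_names <- letters[1:4]
dat <- do.call(rbind, lapply(seq_along(line_names), function(i) {
  x <- seq(1, 2 * i)
  data.frame(x = x, y = x * i + rnorm(length(x)), grp = line_names[i])
}))

# Placeholder display labels, one per group, in the same order as line_names.
legend_labels <- paste("Series", seq_along(line_names))

plt <- ggplot(dat, aes(x = x, y = y, colour = grp)) +
  geom_line() +
  geom_point() +
  # breaks and labels are given together so each label is tied to its group.
  scale_colour_discrete(name = "Data series",
                        breaks = line_names,
                        labels = legend_labels)
print(plt)
```

Giving `breaks` and `labels` together avoids relying on the default ordering of the groups.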

Arun