-1

I have data of the form:

[x]  [y1] [y2]
1    0.9   1
2    2.0   2
3    3.1   3

Where (x,y1) are actual values and y2 is a prediction for y1 based on a linear model estimated on another set of data. (x, y1, y2) are in a dataframe DT. How can I make a scatterplot using xyplot that graphs x on the x-axis and y1 and y2 on the y-axis but with different colors?

I was able to do this in ggplot using the following code, but I think the result looks much less nice than the output of xyplot(), so I am wondering whether I can use xyplot / lattice in this case.

ggplot(DT, aes(x)) + geom_point(aes(y=y1), color="red") + geom_point(aes(y=y2), color = "green")

Thank you very much in advance!

Mark Peterson
A. Elizabeth
    A simple reproducible example would be great to get an idea of what you currently have for data. It makes it easier for us to help you. If you have a data frame named DT, try `dput(DT[1:10,])` for a nice output that will let us help you – TBSRounder Sep 22 '16 at 19:02

3 Answers

1

The simple answer is that you need to make your data tidy for ggplot to be able to easily do what you want. Since you didn't give us an actual example to work with, I am generating some sample data and then tidying it (note: this uses dplyr and tidyr).

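# Packages used in this answer
library(dplyr)
library(tidyr)
library(ggplot2)

# Wide sample data: one x-like column (Petal.Length) and two y-like columns
exampleData <-
  iris %>%
  filter(Species == "setosa") %>%
  slice(1:10) %>%
  select(Sepal.Length:Petal.Length)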

exampleData

toPlot <-
  exampleData %>%
  gather(sepalMeasure, size, -Petal.Length)

Then, you can use the generated sepalMeasure column to color the points. For your data, you would have something that distinguished the predicted and actual points.

toPlot %>%
  ggplot(aes(x = Petal.Length
             , y = size
             , col = sepalMeasure)) +
  geom_point()
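
For the layout described in the question (columns x, y1, y2), the same reshaping might look like the sketch below. The data frame DT is rebuilt here from the three rows shown in the question, and the column names `series` and `y` are placeholders chosen for illustration, not anything the original poster used.

```r
library(tidyr)
library(ggplot2)

# Toy data copied from the three rows shown in the question
DT <- data.frame(x = 1:3, y1 = c(0.9, 2.0, 3.1), y2 = c(1, 2, 3))

# Stack y1 and y2 into one value column, with a key column naming the source
long <- gather(DT, key = "series", value = "y", y1, y2)

# Colour by source column; red/green mirror the colours used in the question
p <- ggplot(long, aes(x = x, y = y, colour = series)) +
  geom_point() +
  scale_colour_manual(values = c(y1 = "red", y2 = "green"))
print(p)
```

The long table has one row per (x, series) pair, so a 500-row DT would yield 1000 rows.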


Mark Peterson
  • Thanks! I apologize for not sharing my actual data; it's organized as a dataframe with three columns, x, y1, and y2 and about 500 rows. Right now the predicted points are a separate variable (y2) from the actual points (y1), as opposed to having one y variable and a variable like SepalMeasure to distinguish groupings/colors. On the example above, it would be as if there were two different variables for size, one for each sepalMeasure. Is it possible to use a similar syntax for the graph in that case? – A. Elizabeth Sep 22 '16 at 23:29
  • `exampleData` is exactly like you describe your data. I used `gather` to create `sepalMeasure` & `size`. You will need to do the same with your data. – Mark Peterson Sep 23 '16 at 00:12
1

You can plot two y variables by using y1 + y2 on the left-hand side of the formula:

 library(lattice)
 d <- data.frame(x = 1:9, y1 = 2:10, y2 = 3:11)
 xyplot(y1 + y2 ~ x, data = d)

EDIT: You can add a legend with

 xyplot(y1 + y2 ~ x, data = d, auto.key = TRUE)

Use other lattice features to control colors, labels, etc.
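
As one possible illustration of those features, the sketch below sets the point colours through `par.settings` so that the legend produced by `auto.key` picks them up as well. The data frame is rebuilt from the rows in the question, and the legend labels are assumptions made for this example.

```r
library(lattice)

# Toy data copied from the three rows shown in the question
DT <- data.frame(x = 1:3, y1 = c(0.9, 2.0, 3.1), y2 = c(1, 2, 3))

p <- xyplot(
  y1 + y2 ~ x, data = DT,
  # simpleTheme() sets the colours for both the points and the legend
  par.settings = simpleTheme(col = c("red", "green"), pch = 16),
  # Legend labels are illustrative; the default would be "y1" and "y2"
  auto.key = list(text = c("actual (y1)", "predicted (y2)"), columns = 2),
  xlab = "x", ylab = "y"
)
print(p)  # lattice objects must be printed explicitly inside scripts and functions
```

Passing `col = c("red", "green")` directly to `xyplot()` would colour the points but leave the legend on the default theme, which is why the theme route is used here.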

DaveTurek
0

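Use the with(), plot() and points() functions from base R graphics. With the question's data frame DT (columns x, y1, y2), set ylim from both series so that no y2 point falls outside the plotting region:

with(DT, plot(x, y1, col = "red", ylim = range(y1, y2), ylab = "y"))
with(DT, points(x, y2, col = "green"))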
Valerica