
I've searched for an answer to my problem, and the closest I could find was "Why does plot behave differently for same but scaled data?". I understand atomic objects, and I already convert the scaled data to a data frame.

I've loaded some chemical Reaction data:

    library(car)

    theURL <- "http://lib.stat.cmu.edu/datasets/Andrews/T30.1"
    theNames <- c("Table", "Number", "Row", "Experiment", "Temperature",
                  "Concentration", "Time", "Unchanged", "Converted", "Unwanted")
    Reaction <- read.table(theURL, header = FALSE, col.names = theNames)
    # Drop the first four columns (Table, Number, Row, Experiment)
    Reaction <- Reaction[-c(1:4)]

I then draw a scatterplot with solid lines through the mean of X and the mean of Y, and dotted lines at the means ± 2 SDs. I also draw a segment of slope Sy/Sx spanning the means ± 5 SDs, because I couldn't get abline() to draw it.

    scatterplot(Reaction$Temperature, Reaction$Converted, smooth = FALSE,
                regLine = FALSE, grid = FALSE, xlim = c(150, 185), ylim = c(45, 70),
                xlab = "Temperature", ylab = "% Converted", main = "Reaction Results")
    TempMean = mean(Reaction$Temperature)
    ConvMean = mean(Reaction$Converted)
    TempSD = sd(Reaction$Temperature)
    ConvSD = sd(Reaction$Converted)
    abline(col = c("red", "green"), v = TempMean, h = ConvMean)
    abline(col = "green", lty = "dotted", v = (c(TempMean - 2*TempSD, TempMean + 2*TempSD)))
    abline(col = "red", lty = 3, h = (c(ConvMean - 2*ConvSD, ConvMean + 2*ConvSD)))
    segments(TempMean - 5*TempSD, ConvMean - 5*ConvSD, TempMean + 5*TempSD, ConvMean + 5*ConvSD)

Bonferroni(?) Limits for bivariate data

...and now the big reveal. If I scale everything, the scatterplot essentially does the same thing.

    # Scale Reaction Data
    Reaction.scaled <- as.data.frame(scale(Reaction))
    # Mean and sd Lines
    scatterplot(Reaction.scaled$Temperature, Reaction.scaled$Converted, smooth = FALSE,
                regLine = FALSE, grid = FALSE,
                xlab = "Temperature", ylab = "% Converted", main = "Reaction Results")
    TempMean = mean(Reaction.scaled$Temperature)
    ConvMean = mean(Reaction.scaled$Converted)
    TempSD = sd(Reaction.scaled$Temperature)
    ConvSD = sd(Reaction.scaled$Converted)
    abline(col = c("red", "green"), v = TempMean, h = ConvMean)
    abline(col = "green", lty = "dotted", v = (c(TempMean - 2*TempSD, TempMean + 2*TempSD)))
    abline(col = "red", lty = 3, h = (c(ConvMean - 2*ConvSD, ConvMean + 2*ConvSD)))
    segments(TempMean - 5*TempSD, ConvMean - 5*ConvSD, TempMean + 5*TempSD, ConvMean + 5*ConvSD)

Scaled Plot

...but the drawing doesn't show the scaled mean at (0,0). I suspect this is something to do with high-level vs low-level graphics functions.

  • The fact that you have an error message makes me wonder about what might be going on. What are the actual means? – Elin Mar 06 '19 at 03:29
  • There is no error message, actually. Something that I forgot to remove from the reprex(), which gives an error on everything when you copy in small chunks. – Sciolism Apparently Mar 06 '19 at 03:30
  • I shouldn't presume that the means won't help. `> TempMean` `[1] 167.3158` `> ConvMean` `[1] 56.33684` – Sciolism Apparently Mar 06 '19 at 03:32
  • How does this show up on the plot: `abline(col = c("red", "green"), v = 0, h = 0)` – DanY Mar 06 '19 at 06:03
  • `abline(col = c("red", "green"), v = 0, h = 0)` produces exactly the same thing (i.e., the second graph). – A. S. K. Mar 06 '19 at 06:19
  • I think the first graph has the same problem; it's just less obvious because the means aren't zero. `mean(Reaction$Converted) = 56.34`, but to my eye the red line is between 55 and 56. – A. S. K. Mar 06 '19 at 06:24
  • Hmm. From what little I know, this sounds like a difference between a plot axis and a user axis. I’m so light on the lingo that I’m not even sure what they’re called, but somewhere I read that there are multiple coordinate systems, and I’m clearly not sending the lines to the right ones. Curious, though, that the points and the abline() and segment() commands don’t use the same system. – Sciolism Apparently Mar 06 '19 at 07:02
  • I think you're right. This example illustrates the problem directly: `library(car); test.data = expand.grid(x = 1:5, y = 1:5); scatterplot(test.data$x, test.data$y, smooth = F, regLine = F); points(test.data$x, test.data$y, pch = 4, col = "red")`. Looks like [this person](https://stackoverflow.com/questions/20433063/add-x-y-line-to-scatterplot) encountered the same problem. Would [ggExtra](https://github.com/daattali/ggExtra) work instead? – A. S. K. Mar 06 '19 at 07:19
  • I’m used to ggPlot and ggExtra, which is why I’m having trouble with all the base R stuff. I’m doing an intro stat book where I was told to use R (huzzah!) but that I was writing a book about stats, not R. I didn’t get far before my explanations of ggPlot and dplyr were taking over the beginning of the book, so I’m back to base. – Sciolism Apparently Mar 06 '19 at 08:11
  • Just use the base `plot` function. The error does not occur with that function. You should probably report this to the maintainer of `car`, probably John Fox – IRTFM Mar 06 '19 at 09:51
  • I meant the means of the scaled variables. – Elin Mar 06 '19 at 12:11
  • The means of the scaled variables were essentially zero. On the order of 10^-9 or 10^-10. – Sciolism Apparently Mar 06 '19 at 14:38

1 Answer


As always, this problem was due to my own misunderstanding of the documentation.

Under ?scatterplot

reset.par
if TRUE (the default) then plotting parameters are reset to their previous values when scatterplot exits; if FALSE then the mar and mfcol parameters are altered for the current plotting device. Set to FALSE if you want to add graphical elements (such as lines) to the plot.

And, indeed, adding reset.par = FALSE to the car-based scatterplots above works. Try it yourself for fun at home!
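For anyone who wants to see the fix without downloading the data, here is a minimal sketch. It assumes the car package is installed, and it uses a synthetic, scaled stand-in for the Reaction data (the `demo` data frame, its 20 rows, and the seed are my own assumptions, not the real data set). The only change from the question's call is `reset.par = FALSE`.

```r
library(car)

# Synthetic stand-in for the scaled Reaction data (an assumption for
# illustration; the real data come from the URL in the question).
set.seed(1)
demo <- as.data.frame(scale(data.frame(
  Temperature = rnorm(20, mean = 167, sd = 6),
  Converted   = rnorm(20, mean = 56, sd = 4)
)))

TempMean <- mean(demo$Temperature)
ConvMean <- mean(demo$Converted)
TempSD   <- sd(demo$Temperature)
ConvSD   <- sd(demo$Converted)

# reset.par = FALSE keeps the plotting parameters that scatterplot() set,
# so the low-level calls below are added to the same plot.
scatterplot(demo$Temperature, demo$Converted, smooth = FALSE, regLine = FALSE,
            grid = FALSE, reset.par = FALSE,
            xlab = "Temperature", ylab = "% Converted", main = "Reaction Results")

# Solid lines at the means, dotted lines at the means +/- 2 SDs
abline(v = TempMean, col = "green")
abline(h = ConvMean, col = "red")
abline(v = TempMean + c(-2, 2) * TempSD, col = "green", lty = "dotted")
abline(h = ConvMean + c(-2, 2) * ConvSD, col = "red", lty = "dotted")
```

With the scaled data the solid lines should now cross at (0, 0), which is what the original plot failed to show.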

The relevant graphic that caused the question, with the correction: Scaled data with solid lines at means, dotted lines at means±2SDs
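One of the comments suggests sidestepping the issue with base `plot()`, where the problem reportedly does not occur. A rough sketch of that route follows, again on a synthetic stand-in for the scaled data (the `demo` data frame and the seed are assumptions). It also draws the Sy/Sx line with `abline(a, b)` by working out the intercept that makes the line pass through the point of means, which may be a simpler alternative to the `segments()` workaround in the question.

```r
# Synthetic stand-in for the scaled Reaction data (an assumption for
# illustration, not the real data set).
set.seed(1)
demo <- as.data.frame(scale(data.frame(
  Temperature = rnorm(20, mean = 167, sd = 6),
  Converted   = rnorm(20, mean = 56, sd = 4)
)))

TempMean <- mean(demo$Temperature)
ConvMean <- mean(demo$Converted)
TempSD   <- sd(demo$Temperature)
ConvSD   <- sd(demo$Converted)

# Base graphics: no marginal boxplots, no reset.par to worry about
plot(demo$Temperature, demo$Converted,
     xlab = "Temperature", ylab = "% Converted", main = "Reaction Results")

# Solid lines at the means, dotted lines at the means +/- 2 SDs
abline(v = TempMean, col = "green")
abline(h = ConvMean, col = "red")
abline(v = TempMean + c(-2, 2) * TempSD, col = "green", lty = "dotted")
abline(h = ConvMean + c(-2, 2) * ConvSD, col = "red", lty = "dotted")

# Line of slope Sy/Sx through (TempMean, ConvMean):
# y = ConvMean + slope * (x - TempMean), so intercept = ConvMean - slope * TempMean
slope     <- ConvSD / TempSD
intercept <- ConvMean - slope * TempMean
abline(a = intercept, b = slope)
```

The trade-off is that base `plot()` does not draw the marginal boxplots that `scatterplot()` adds by default.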