1

The data for some graphs of this type that I'm plotting in R,

http://graphpad.com/faq/images/1352-1(1).gif

has outliers that are far out of range, and I can't just exclude them. I tried the axis.break() function from plotrix, but it doesn't rescale the y axis; it only places a break mark on the axis. The goal is to show the medians for both groups, the data points, and the outliers all in one plot frame. As it stands, the few points that lie far from the majority take up most of the vertical space, so the majority of points are squished together and the differences between them are hard to see. Here is the code:

https://gist.github.com/9bfb05dcecac3ecb7491

Any suggestions would be helpful.

Thanks

pnuts
crazian

2 Answers

3

Unfortunately the code you link to isn't self-contained, but the gap.plot() call there possibly doesn't work as you expect because you set ylim to cover the full data range rather than only the plotted sections. Consider the following plot:

[Figure: scatterplot with gap in y axis]

As you can see, the y axis has tick marks every 50 pg/ml, but there is a gap between 175 and 425. So the data range (to the nearest 50) is c(0, 500), but the range of the y axis is c(0, 250): the tick marks at positions 200 and 250 are simply labelled 450 and 500.

This plot was produced using the following modified version of your code:

## made up data
GRO.Controls <- c(25, 40:50, 60, 150)
GRO.Breast <- c(70, 80:90, 110, 500)

##Scatter plot for both groups
library(plotrix)
gap.plot(jitter(rep(0,length(GRO.Controls)),amount = 0.2), GRO.Controls,
         gap = c(175,425), xtics = -2, # no xtics visible
         ytics = seq(0, 500, by = 50),
         xlim = c(-0.5, 1.5), ylim = c(0, 250), 
         xlab = "", ylab = "Concentrations (pg/ml)", main = "GRO(P=0.0010)")
gap.plot(jitter(rep(1,length(GRO.Breast)),amount = 0.2), GRO.Breast,
         gap = c(175, 425), col = "blue", add = TRUE)

##Adds x- variable (groups) labels
mtext("Controls", side = 1, at= 0.0)
mtext("Breast Cancer", side = 1, at= 1.0)

##Adds median lines for each group
segments(-0.25, median(GRO.Controls), 0.25, median(GRO.Controls), lwd = 2.0)
segments(0.75, median(GRO.Breast), 1.25, median(GRO.Breast), lwd = 2.0, 
    col = "blue")
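
The ylim arithmetic described above can be sketched in a few lines of base R. This is only an illustration: `compressed_ylim()` is a hypothetical helper, not a plotrix function, and it assumes the gaps lie inside the data range and do not overlap. It subtracts the total gap width from the top of the data range, for either one gap or two.

```r
# Hypothetical helper (NOT part of plotrix): the y-axis range left over
# once the gap widths are removed from the full data range.
compressed_ylim <- function(data_range, gap) {
  stopifnot(length(data_range) == 2, length(gap) %in% c(2, 4))
  gap_bounds <- matrix(gap, nrow = 2)            # one column per gap: lower, upper
  gap_widths <- gap_bounds[2, ] - gap_bounds[1, ]
  c(data_range[1], data_range[2] - sum(gap_widths))
}

# One gap, as in the plot above: 500 - (425 - 175) = 250
one_gap <- compressed_ylim(c(0, 500), gap = c(175, 425))

# Two gaps (illustrative values): 500 - (85 - 65) - (435 - 165) = 210
two_gaps <- compressed_ylim(c(0, 500), gap = c(65, 85, 165, 435))
```

The result would then be passed as `ylim` to `gap.plot()` together with the same `gap` vector.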
Heather Turner
  • Hi Heather, thank you for your reply. I tried to fiddle around with the ylim and create more gaps because my data looks a bit more like this: You can see or notice that the 500 point is missing. How do I deal with that? – crazian Sep 15 '12 at 00:28
  • `gap.plot` only works with one or two gaps, so passing a vector of more than four numbers to the `gap` argument will produce strange results! In your example, the gaps between 60 and 110, and between 150 and 500 are the biggest, so you could use `gap = c(65, 85, 165, 435)` to sit nicely between tickmarks at 25 or 50 pg/ml intervals. This would give three plotted sections: 0-65, 85-165 and 435-500, which together span 65 + 80 + 65 = 210 units, so you would need `ylim = c(0, 210)`. Of course, you may prefer to highlight the observation at 60 as an outlier and put the first gap lower down, but hopefully you get the general idea. – Heather Turner Sep 17 '12 at 09:12
  • Hi Heather, what if my data looks like this? `v <- c(0.2, 2.0, 1.4, 5.3, 0.4, 0.5, 0.7)` and `c <- c(2.3, 2.5, 4.2, 3.7, 6.2, 4.1, 3.9, 4.5, 29, 100, 400, 284)`. I want to show that the median in the c group is higher than in the v group, but because of the tremendously high outliers, the two medians look the same on the scale. Do you have any ideas how I can plot this out on a scatter? – crazian Sep 18 '12 at 00:44
  • Well, you could use something like `gap = c(107, 272, 286, 392), ytics = seq(0, 500, by = 20), ylim = c(0, 130)` but the result isn't great. Better in this case to use `plot()` with `log = "y", ylim = c(0.2, 400), las = 1`. You can also use `log = "y"` with other plot functions, e.g. `boxplot`. – Heather Turner Sep 18 '12 at 13:15
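
The log-scale suggestion in the last comment might look something like the sketch below, using only base R graphics. The data are the values posted in the comment; the names `grp_v` and `grp_c`, the axis labels, the jitter amount and the colours are assumptions made for illustration (the groups are renamed so that `c` does not mask `base::c()`).

```r
# Data from the comment; renamed (assumption) to avoid a variable called `c`
grp_v <- c(0.2, 2.0, 1.4, 5.3, 0.4, 0.5, 0.7)
grp_c <- c(2.3, 2.5, 4.2, 3.7, 6.2, 4.1, 3.9, 4.5, 29, 100, 400, 284)

set.seed(1)  # reproducible jitter
x <- c(jitter(rep(0, length(grp_v)), amount = 0.2),
       jitter(rep(1, length(grp_c)), amount = 0.2))
y <- c(grp_v, grp_c)

# Log-scaled y axis keeps the outliers in view without squashing the bulk
plot(x, y, log = "y", xlim = c(-0.5, 1.5), ylim = c(0.2, 400), las = 1,
     xaxt = "n", xlab = "", ylab = "Value (log scale)",
     col = rep(c("black", "blue"), c(length(grp_v), length(grp_c))))
axis(1, at = c(0, 1), labels = c("v", "c"))

# Median line for each group
med_v <- median(grp_v)
med_c <- median(grp_c)
segments(-0.25, med_v, 0.25, med_v, lwd = 2)
segments(0.75, med_c, 1.25, med_c, lwd = 2, col = "blue")
```

On a log axis equal distances correspond to equal ratios, so the difference between the two medians should remain visible even with values as large as 400 in the same frame.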
2

You could use gap.plot(), which is easy to find by following the link on the axis.break help page. There is a worked example there.

IRTFM