Is there a way to optimize the graph quality from R?

I have 30 million data points, and I generated a Q–Q plot and saved it as a PDF file using:

pdf("myPlot.pdf")
qqnorm(X)
dev.off()

But the PDF file is so big that I cannot even open it to view!

Is there a way to save this either at a lower quality or as a different file type (I don't necessarily need PDF) so I can view the graph?

Peter Mortensen
user1007742
  • 1
    `pdf` is a vector format, so the PDF viewer has to render all 30 million data points on the fly. Export to `png` instead, which is a raster format with lossless compression. – daroczig Sep 29 '13 at 21:50

2 Answers

You have a few options.

  1. Don't plot all the points. Compare:

    X = rnorm(1e5)
    qqnorm(X, xlim=c(-4.5, 4.5), ylim=c(-4.5, 4.5))
    qqnorm(X[seq(1, length(X), 5)], xlim=c(-4.5, 4.5), ylim=c(-4.5, 4.5))
    qqnorm(X[seq(1, length(X), 10)], xlim=c(-4.5, 4.5), ylim=c(-4.5, 4.5))
    

    I would suggest it's almost impossible to visually notice a difference between the three plots.

  2. Don't use the pdf plotting device. Instead try png or jpeg. These functions have a resolution argument, res, that controls the plotting resolution. So something like this should do the trick:

    ppi = 300
    png("mygraph.png", width=6*ppi, height=6*ppi, res=ppi)
    qqnorm(X)
    dev.off()
    
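The two options can also be combined. The following is only a sketch, not something from the original answer: `thin_qq` is a hypothetical helper, and its `n_keep` and `n_tail` arguments are made-up names. It computes the Q-Q coordinates from the full sample (pairing the sorted data with `qnorm(ppoints(n))`, as `qqnorm` does), keeps an evenly spaced subset of the order statistics plus the most extreme points at each end, and writes the result to a PNG. Thinning after the quantile positions are computed means the extreme points stay where they would be in the full plot, which is not guaranteed when the data are subsampled before calling `qqnorm`.

```r
# Hypothetical helper (an assumption, not part of base R): build Q-Q
# coordinates from the full sample, then keep about n_keep evenly spaced
# order statistics plus the n_tail most extreme points at each end.
thin_qq <- function(x, n_keep = 10000, n_tail = 100) {
  y <- sort(x[!is.na(x)])
  n <- length(y)
  stopifnot(n > 0)
  idx <- sort(unique(c(
    seq_len(min(n, n_tail)),                           # lower tail
    round(seq(1, n, length.out = min(n, n_keep))),     # evenly spaced body
    seq(max(1, n - n_tail + 1), n)                     # upper tail
  )))
  # Same plotting positions qqnorm() uses, evaluated only at the kept indices
  list(x = qnorm(ppoints(n)[idx]), y = y[idx])
}

set.seed(42)
X <- rnorm(1e6)                  # stand-in for the real data
qq <- thin_qq(X)

out <- file.path(tempdir(), "qq_thinned.png")
ppi <- 300
png(out, width = 6 * ppi, height = 6 * ppi, res = ppi)
plot(qq$x, qq$y, pch = ".",
     xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
     main = "Normal Q-Q Plot (thinned)")
dev.off()
```

With the defaults this plots at most 10,200 points regardless of the sample size, so the same call should also be usable with the `pdf` device if a vector file is needed.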
csgillespie

I have found a similar question very informative, mainly with respect to the options you don't have.

For sharing the plot, I would show only a sufficiently large fraction of the points; that is the purpose of a plot anyway: to provide a visual overview of the data, not necessarily to include every point. For viewing only, I would pick the PNG format.
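To get a feel for the trade-off, one can compare file sizes directly. This is a rough sketch on simulated data: `qq_file_size` is a hypothetical helper, and the 10% sampling fraction is an arbitrary choice. A PDF stores every point as a drawing instruction, so its size grows with the number of points, whereas the size of a PNG is governed mainly by its pixel dimensions.

```r
# Hypothetical helper (an assumption): draw a Q-Q plot of x on the given
# graphics device, writing to a temporary file, and return its size in bytes.
qq_file_size <- function(device, x, ext, ...) {
  f <- tempfile(fileext = ext)
  device(f, ...)
  qqnorm(x)
  dev.off()          # close the device first so the file is complete
  file.size(f)
}

set.seed(1)
X <- rnorm(1e5)                          # stand-in for the real data
X_sub <- sample(X, length(X) %/% 10)     # 10% random sample

sizes <- c(
  pdf_full = qq_file_size(pdf, X, ".pdf"),
  pdf_sub  = qq_file_size(pdf, X_sub, ".pdf"),
  png_full = qq_file_size(png, X, ".png",
                          width = 1800, height = 1800, res = 300)
)
print(sizes)   # sizes in bytes; exact values depend on the R version and platform
```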

Community