
I usually take the tikz-pgfplot route through the gnuplot-lua interface to draw scientific figures for research papers. It usually works very well, and I can seamlessly integrate my figures into LaTeX documents. The figures produced this way are high-resolution and refined. However, the stumbling block is a high-resolution scatterplot of a large dataset, to the tune of 100,000 points.

If I follow my usual tikz-pgfplot route, the LaTeX file is produced, but compiling it with pdflatex fails with the tex memory exceeded... error. I have also learned that increasing TeX's memory is not a good idea. So I ended up producing an EPS (Encapsulated PostScript) figure, which I then include in my LaTeX document through tikz-pgfplot to render the annotations. This usually works, but it results in a very large PDF file, to the tune of 2 MB for a small figure, and the PDF reader takes a long time to fully display the figure.

I was wondering whether there are any other ways to produce a high-resolution scatterplot of a large dataset. Any pointers would be highly appreciated.

Madhur

Madhurjya

1 Answer


Any vector format representation of 10^5 points is necessarily going to be large because each point is described separately even if it lies on top of or underneath many other points. The generic solution is to use a bitmap format for the plot, since each pixel in the plot is either set or not set no matter how many points lie on top of it. The size of the output representation is to first approximation not dependent on the number of points.
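To make the scaling argument concrete, here is a rough Python sketch that compares the two representations for random points. It is only an illustration under stated assumptions: the SVG-like `<circle>` markup stands in for whatever per-point drawing command a real vector format uses, the 400x280 pixel grid and 1 bit per pixel are arbitrary choices, and the helper names (`vector_size`, `bitmap_size`, `compare`) are made up for this example. The byte counts will not match those of an actual PDF.

```python
import random
import zlib


def vector_size(points):
    """Bytes needed if every point gets its own drawing command (SVG-like stand-in)."""
    body = "".join(
        f'<circle cx="{x:.4f}" cy="{y:.4f}" r="0.5"/>\n' for x, y in points
    )
    return len(body.encode("ascii"))


def bitmap_size(points, width=400, height=280):
    """Bytes for a 1-bit-per-pixel raster of the same points, zlib-compressed."""
    row_bytes = (width + 7) // 8
    raster = bytearray(row_bytes * height)
    for x, y in points:
        col = min(int(x * width), width - 1)
        row = min(int(y * height), height - 1)
        # Setting a pixel that is already set changes nothing, so
        # overlapping points cost no extra space.
        raster[row * row_bytes + col // 8] |= 0x80 >> (col % 8)
    return len(zlib.compress(bytes(raster)))


def compare(n, seed=0):
    """Return (vector bytes, bitmap bytes) for n uniform random points in the unit square."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return vector_size(points), bitmap_size(points)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        v, b = compare(n)
        print(f"{n:>7} points: vector {v:>9} bytes, bitmap {b:>6} bytes")
```

In this sketch the vector size grows in direct proportion to the number of points, while the bitmap size can never exceed roughly the size of the uncompressed pixel grid, however many points are drawn.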

Sticking with gnuplot, I would probably use set terminal cairolatex png standalone to generate an initial plot description, followed by pdflatex to produce a final pdf with the bitmap embedded in it. E.g.:

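# Create a bitmapped version: LaTeX text plus an embedded PNG of the points
set term cairolatex png standalone size 10cm, 7cm
set output 'cairolatex+png.tex'
set xrange [0:1]      # the '+' pseudo-file samples along the x range
set samples 100000    # number of random points to draw
plot '+' using (rand(0)):(rand(0)) with dots
unset output          # close the file before running pdflatex on it
system("pdflatex cairolatex+png")

# Create a vector version for comparison, with only 10000 points
set term tikz standalone size 10cm, 7cm
set output 'tikz.tex'
set samples 10000
replot
unset output
system("pdflatex tikz")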

The first plot completes immediately and produces a smaller file. The second plot takes several minutes and produces a larger file despite containing only 1/10 the number of points.

[236] ls -s1 *.pdf
416 cairolatex+png.pdf
844 tikz.pdf

Both use latex for the text portions of the plot, although the default fonts may not be the same.

Ethan
  • I appreciate the answer. However, for some reason, I don't have the `png` option for `gnuplot`'s `cairolatex` terminal. So I used the `pngcairo` terminal, which produced the same graphics, and now my final `pdf` figure is about `18 KB` instead of the earlier `1.8 MB`! Thanks! – Madhurjya Oct 20 '18 at 05:44