Hej,

I'd like to produce high-quality PDFs from matplotlib plots. Using other code, I have produced a large array of numbers, which I plot in a figure using plt.imshow. When I then write a PDF with plt.savefig, the result differs strongly depending on which backend I use. Most importantly, the files are huge with the Agg or MacOSX backend, while they are reasonably small with Cairo (see the examples below). On the other hand, the Cairo backend renders the labels poorly when combined with TeX text rendering, which looks awful in the TeX document. My question is therefore twofold:

  1. Is it possible to produce a small PDF (i.e. presumably without interpolating the raster image to a higher resolution) using the Agg backend?
  2. Can one change some text settings for the Cairo backend so that the text looks similar to ordinary TeX (which is the case for the Agg backend)?

Here is some example code for test purposes:

import matplotlib as mpl
mpl.use( "cairo" )

import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True

data = np.random.rand( 50, 50 )

plt.imshow( data, interpolation='nearest' )
plt.xlabel( 'X Label' )
plt.savefig( 'cairo.pdf' )

produces a PDF of 15 KB with a bad-looking xlabel.

import matplotlib as mpl
mpl.use( "agg" )

import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True

data = np.random.rand( 50, 50 )

plt.imshow( data, interpolation='nearest' )
plt.xlabel( 'X Label' )
plt.savefig( 'agg.pdf' )

produces a PDF of 986 KB, which looks good.

I should probably add that I use matplotlib 1.0.1 with Python 2.6.7 on OS X 10.6.8. In the comments, someone requested the output of `grep -a Font agg.pdf`:

/Shading 6 0 R /Font 3 0 R >>
<< /FontFile 16 0 R /Descent -285 /FontBBox [ -174 -285 1001 953 ]
/StemV 50 /Flags 4 /XHeight 500 /Type /FontDescriptor
/FontName /NimbusSanL-Regu /CapHeight 1000 /FontFamily (Nimbus Sans L)
%!PS-AdobeFont-1.0: NimbusSanL-Regu 1.05a
FontDirectory/NimbusSanL-Regu known{/NimbusSanL-Regu findfont dup/UniqueID known{dup
/UniqueID get 5020902 eq exch/FontType get 1 eq and}{pop false}ifelse
/FontType 1 def
/FontMatrix [0.001 0 0 0.001 0 0 ]readonly def
/FontName /NimbusSanL-Regu def
/FontBBox [-174 -285 1001 953 ]readonly def
/FontInfo 9 dict dup begin
/BaseFont /NimbusSanL-Regu /Type /Font /Subtype /Type1
/FontDescriptor 15 0 R /Widths 13 0 R /LastChar 255 /FirstChar 0 >>
<< /FontFile 20 0 R /Descent -251 /FontBBox [ -34 -251 988 750 ] /StemV 50
/Flags 4 /XHeight 500 /Type /FontDescriptor /FontName /CMR12
/CapHeight 1000 /FontFamily (Computer Modern) /ItalicAngle 0 /Ascent 750 >>
%!PS-AdobeFont-1.0: CMR12 003.002
%Copyright:  (<http://www.ams.org>), with Reserved Font Name CMR12.
% This Font Software is licensed under the SIL Open Font License, Version 1.1.
FontDirectory/CMR12 known{/CMR12 findfont dup/UniqueID known{dup
/UniqueID get 5000794 eq exch/FontType get 1 eq and}{pop false}ifelse
/FontType 1 def
/FontMatrix [0.001 0 0 0.001 0 0 ]readonly def
/FontName /CMR12 def
/FontBBox {-34 -251 988 750 }readonly def
/FontInfo 9 dict dup begin
 /Notice (Copyright \050c\051 1997, 2009 American Mathematical Society \050<http://www.ams.org>\051, with Reserved Font Name CMR12.) readonly def
<< /BaseFont /CMR12 /Type /Font /Subtype /Type1 /FontDescriptor 19 0 R
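For anyone who wants to repeat this check without grep, the following is a minimal Python sketch that pulls the `/FontName` entries out of the raw PDF bytes. The function name `embedded_font_names` is made up for illustration, and the approach assumes the font descriptors are stored uncompressed, as they appear to be in the output above; names inside compressed object streams would be missed.

```python
import re

# Matches PDF name objects such as "/FontName /CMR12" in raw bytes.
FONT_NAME_RE = re.compile(rb"/FontName\s*/([^\s/<>\[\]()]+)")


def embedded_font_names(pdf_bytes):
    """Return the sorted, de-duplicated font names found in raw PDF bytes.

    Hypothetical helper: it only scans for uncompressed /FontName entries
    and does not parse the PDF structure.
    """
    names = {m.group(1).decode("latin-1") for m in FONT_NAME_RE.finditer(pdf_bytes)}
    return sorted(names)


# Example usage (assumes agg.pdf exists in the working directory):
# with open("agg.pdf", "rb") as fh:
#     print(embedded_font_names(fh.read()))
```

Applied to the output above, this should report the two fonts that show up there, NimbusSanL-Regu and CMR12.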
David Zwicker
  • I don't know the answer, but would you please post `grep -a Font agg.pdf`? – unutbu Sep 08 '11 at 10:33
  • I put the output into the main post, since the comment section is too small to hold it. Thanks for making an effort! I also suspected that there could be a problem with the fonts. I basically try to use Computer Modern, in order to match my TeX document. – David Zwicker Sep 08 '11 at 11:23
  • Have a look at this question about matplotlib and cairo, it may give you some hints. http://stackoverflow.com/questions/2797525/matplotlib-pdf-export-uses-wrong-font – James Hurford Sep 08 '11 at 11:50
  • I have indeed looked at this post, but I think the font properties mentioned there only apply when using matplotlib to render text (instead of TeX). I'm not sure, but it also seems as if the cairo backend is able to find the right font, but the text seems fainter (anti-aliasing?) and is sometimes shifted a little bit. – David Zwicker Sep 08 '11 at 14:13
  • I have tried both your examples (python 2.7.2 mpl 1.0.1 python2-cairo 1.10.0) and found the following: the `plt.rcParams['text.usetex'] = True` causes an error when I try to execute `plt.savefig(...)`, so I removed it. Then I get 15.5 KB for cairo.pdf and 18.3 KB for agg.pdf, and both look the same. Can't you compress the agg.pdf afterwards? – steabert Sep 18 '11 at 07:28
  • The usetex option does work when I save to a postscript file instead, I can then convert that to pdf and it looks nice. – steabert Sep 18 '11 at 07:37
  • Thank you for your comments! I definitely want to use the text.usetex option, so your first option is not very convenient for me. The second alternative of exporting in some other format works brilliantly, though! I got even better results by exporting as EPS and then using `epspdf` to convert it to PDF. Using PS export lost the bounding box in my case, a problem which I did not pursue, since the EPS works fine. Thanks again! – David Zwicker Sep 23 '11 at 13:33

1 Answer

As suggested by steabert in the comments above, a workaround is to export the graphics in a different format and then convert the result to PDF. Adjusting my example from above, the workflow could look something like this:

import os
import matplotlib as mpl
mpl.use("Agg")

import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['text.usetex'] = True

data = np.random.rand(50, 50)

plt.imshow(data, interpolation='nearest')
plt.xlabel('X Label')
plt.savefig('agg.eps')

os.system('epspdf agg.eps agg.pdf')

This produces a file of 16 KB which looks good. One difference from the examples above remains: the (E)PS pipeline seems to ignore the interpolation='nearest' option, i.e. the image appears blurry in the final PDF. Luckily, I can live with that, but it might be worth looking into this issue.
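If the conversion step should fail loudly instead of silently, as it can with os.system, it could be wrapped roughly as follows. This is only a sketch under some assumptions: it needs Python 3.5 or newer for `subprocess.run` (the example above was written for Python 2.6), it assumes `epspdf` accepts the input and output paths as two positional arguments exactly as in the call above, and the helper names `epspdf_command` and `convert_eps_to_pdf` are invented for illustration.

```python
import os
import shutil
import subprocess


def epspdf_command(eps_path, pdf_path=None):
    """Build the argument list for epspdf; the output defaults to <name>.pdf."""
    if pdf_path is None:
        pdf_path = os.path.splitext(eps_path)[0] + ".pdf"
    # Assumption: same positional form as `epspdf agg.eps agg.pdf` above.
    return ["epspdf", eps_path, pdf_path]


def convert_eps_to_pdf(eps_path, pdf_path=None):
    """Run epspdf and return the PDF path; raise if the tool is missing or fails."""
    cmd = epspdf_command(eps_path, pdf_path)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("epspdf was not found on PATH")
    subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
    return cmd[2]


# Example usage, after plt.savefig('agg.eps'):
# convert_eps_to_pdf('agg.eps')
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces, and `check=True` turns a failed conversion into an exception.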

David Zwicker