I am generating and saving SVG images using matplotlib and would like to make them as reproducible as possible. However, even after setting np.random.seed and random.seed, the various id and xlink:href values in the SVG images still change between runs of my code.

I assume these differences are due to the backend that matplotlib uses to render SVG images. Is there any way to set the seed for this backend such that identical plots produce identical output between two different runs of the code?

Sample code (run this twice, changing the name in plt.savefig for the second run):

import random
import numpy as np
import matplotlib.pyplot as plt

random.seed(42)
np.random.seed(42)

x, y = np.random.randn(4096), np.random.randn(4096)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=(64,64))

fig, axis = plt.subplots()
plt.savefig("random_1.svg")

Compare files:

diff random_1.svg random_2.svg | head
35c35
< " id="md3b71b67b7" style="stroke:#000000;stroke-width:0.8;"/>
---
> " id="m7ee1b067d8" style="stroke:#000000;stroke-width:0.8;"/>
38c38
<        <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#md3b71b67b7" y="307.584"/>
---
>        <use style="stroke:#000000;stroke-width:0.8;" x="57.6" xlink:href="#m7ee1b067d8" y="307.584"/>
82c82
<        <use style="stroke:#000000;stroke-width:0.8;" x="129.024" xlink:href="#md3b71b67b7" y="307.584"/>
saladi

1 Answer


There is an option svg.hashsalt in matplotlib's rcParams which seems to be used exactly for that purpose:

# svg backend params
#svg.image_inline : True       # write raster image data directly into the svg file
#svg.fonttype : 'path'         # How to handle SVG fonts:
#    'none': Assume fonts are installed on the machine where the SVG will be viewed.
#    'path': Embed characters as paths -- supported by most SVG renderers
#    'svgfont': Embed characters as SVG fonts -- supported only by Chrome,
#               Opera and Safari
svg.hashsalt : None           # if not None, use this string as hash salt
                              # instead of uuid4
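
To illustrate why a fixed salt helps, here is a minimal sketch of salted id generation using only the standard library. It is not matplotlib's actual code: the function name `make_id`, the sample path string, and the choice of SHA-256 are assumptions for illustration. The 10-character truncation is chosen only to match the length of the ids in the question's diff. The id is a hash of the salt plus the element's content, and when no salt is given a fresh uuid4 is used, so the id changes on every run.

```python
import hashlib
import uuid


def make_id(prefix, content, salt=None):
    """Derive a short element id from a salt and the element's content.

    Hypothetical helper for illustration, not matplotlib's API.
    """
    if salt is None:
        # No salt configured: fall back to a fresh random value,
        # which makes the resulting id differ from call to call.
        salt = str(uuid.uuid4())
    digest = hashlib.sha256()
    digest.update(salt.encode("utf8"))
    digest.update(str(content).encode("utf8"))
    # Keep a prefix letter plus the first 10 hex characters.
    return prefix + digest.hexdigest()[:10]


tick_path = "M 0 0 L 0 3.5"  # hypothetical marker path data

# Same salt and same content give the same id.
fixed_a = make_id("m", tick_path, salt="42")
fixed_b = make_id("m", tick_path, salt="42")

# Without a salt, each call draws a new uuid4, so the ids differ.
random_a = make_id("m", tick_path)
random_b = make_id("m", tick_path)

print(fixed_a == fixed_b)
print(random_a == random_b)
```

With a fixed salt the first comparison is true on every run, while the unsalted ids only match by astronomically unlikely coincidence.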

The following code produces two identical files, down to the XML ids:

import numpy             as np
import matplotlib        as mpl
import matplotlib.pyplot as plt

mpl.rcParams['svg.hashsalt'] = '42'  # the salt is documented as a string
np.random.seed(42)

x, y = np.random.randn(4096), np.random.randn(4096)

fig, ax = plt.subplots()
ax.hist(x)

for i in [1,2]:
    plt.savefig("random_{}.svg".format(i))
Tom de Geus
Diziet Asahi
    Note that the hashsalt is really supposed to be a string (see the quoted documentation, and a use here: https://github.com/matplotlib/matplotlib/blob/ec8e119235b2d4e750514c1cc8d6192f5465aa9c/lib/matplotlib/testing/__init__.py#L28). matplotlib automatically converts it into a string for you in https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/rcsetup.py – Gus Jun 16 '19 at 01:44