4

I may be doing something really stupid, but I've been using Plotly offline in my Jupyter notebook with this setup:

import numpy as np
import plotly.offline as py
py.init_notebook_mode(connected=True)
from plotly.graph_objs import *

I'm trying to display a sequence of images that can be navigated with a slider. The entire numpy array with the image data is 50 images x 64 wide x 64 tall.

I pass that array to the following slider function, which I pieced together from code I found online. The Figure object that's returned is not very large. However, once plotly's iplot is called, the size of my Jupyter notebook on disk (as measured by ls -l) becomes really big, around 15 MB, even though the numpy source data is only about 1 MB. This becomes unmanageable for larger or multiple figures. Does anyone know what's going on?

def slider_ims(imgs):

    imgs = np.flip(imgs,1) 

    data = [dict(
            type='heatmap',
            z = imgs[step,:,:],
            visible = False,
            showscale=False,
            xaxis="x",
            yaxis="y",
            name = 'z = '+str(step)) for step in np.arange(imgs.shape[0])]
    data[0]['visible'] = True


    steps = []
    for i in range(len(data)):
        step = dict(
            method = 'restyle',
            args = ['visible', [False] * len(data)],
            label = str(i)
        )
        step['args'][1][i] = True # Toggle i'th trace to "visible"
        steps.append(step)

    sliders = [dict(
        active = 0,
        currentvalue = {"prefix": "Frame: "},
        pad = {"t": 50},
        steps = steps,
        ticklen = 0,
        minorticklen = 0
    )]

    layout = Layout(
             sliders = sliders,
             font=Font(family='Balto'),
             width=800,
             height=600,
            )


    fig=Figure(data=data, layout=layout)
    py.iplot(fig)
    return fig
meldefon

3 Answers

2

You want smaller ipynb files? Don't store output cells.

If you only care about the on-disk size of your notebooks, you could change your Jupyter configuration to disable writing output cells to the ipynb file. This would mean that only your code is saved on disk. Whenever you open a notebook, the output cells will be empty and you need to re-run the notebook to get them. You have to decide whether this fits with how you use notebooks.

You can set this up by editing your jupyter_notebook_config.py configuration file, which is typically located in your home directory under ~/.jupyter (Windows: C:\Users\USERNAME\.jupyter\). If it does not exist yet, you can generate it from the terminal with jupyter notebook --generate-config (more info here).

In this configuration file, add a pre-save hook that strips output cells before saving, as described in the documentation:

def scrub_output_pre_save(model, **kwargs):
    """scrub output before saving notebooks"""
    # only run on notebooks
    if model['type'] != 'notebook':
        return
    # only run on nbformat v4
    if model['content']['nbformat'] != 4:
        return

    for cell in model['content']['cells']:
        if cell['cell_type'] != 'code':
            continue
        cell['outputs'] = []
        cell['execution_count'] = None

c.FileContentsManager.pre_save_hook = scrub_output_pre_save

Bonus benefit: Stripping output cells like this is also a great way to get readable diffs for source control, e.g. git.
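
For notebooks that were already saved with their outputs, the same scrubbing logic can be applied after the fact. Here is a minimal sketch, assuming an nbformat v4 file. The helper name strip_outputs and the small example notebook are made up for illustration and are not part of Jupyter.

```python
import json


def strip_outputs(notebook):
    """Clear outputs and execution counts of all code cells, in place.

    `notebook` is the dict obtained by parsing an .ipynb file as JSON.
    Returns the number of code cells that were scrubbed.
    """
    # Like the hook above, only handle nbformat v4
    if notebook.get("nbformat") != 4:
        return 0
    scrubbed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        cell["outputs"] = []
        cell["execution_count"] = None
        scrubbed += 1
    return scrubbed


# Tiny hand-written notebook standing in for a real .ipynb file (hypothetical data)
example_nb = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["print('hi')"],
            "execution_count": 3,
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

size_before = len(json.dumps(example_nb))
n_scrubbed = strip_outputs(example_nb)
size_after = len(json.dumps(example_nb))
print(f"scrubbed {n_scrubbed} code cell(s): {size_before} -> {size_after} characters")
```

For a real file, you would load it with json.load, call strip_outputs on the result, and write it back with json.dump. Keep a backup first, since the outputs cannot be recovered without re-running the notebook.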

mad
1

Plotly figures tend to be large because the full data for every trace is embedded in the figure. Your notebook grew because the inline plot (py.iplot) stores the figure in the notebook's output cell.
If you don't want your notebook to be so large, use the regular py.plot instead, which saves the plot to a separate HTML file.
See the Plotly documentation for details.
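
One likely contributor to the gap between the roughly 1 MB array and the much larger notebook is that the figure data ends up as JSON text, where each number takes far more bytes than in a binary array. The rough sketch below uses only the standard library, with random floats standing in for the real images (an assumption, since the actual data is not shown). It measures plain JSON encoding only and does not account for any further overhead Plotly or the notebook format may add.

```python
import json
import random
from array import array

random.seed(0)

# Same shape as in the question: 50 images, each 64 x 64
n_images, height, width = 50, 64, 64

# Random floats standing in for the real image data (hypothetical values)
images = [
    [[random.random() for _ in range(width)] for _ in range(height)]
    for _ in range(n_images)
]

# Size as packed 64-bit floats, comparable to a float64 numpy array in memory
flat = array("d", (v for img in images for row in img for v in row))
binary_bytes = len(flat) * flat.itemsize

# Size of the same values written out as JSON text
json_bytes = len(json.dumps(images).encode("utf-8"))

print(f"binary: {binary_bytes / 1e6:.2f} MB")
print(f"JSON:   {json_bytes / 1e6:.2f} MB")
print(f"ratio:  {json_bytes / binary_bytes:.1f}x")
```

If the images only need a few significant digits, rounding the values or converting them to small integers before building the traces should shrink the text form considerably.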

Iqbal Basyar
0

In my case, I did the following to avoid saving the plot along with the notebook. I prefer to save the figure as a separate HTML page to keep the notebook small.

import plotly.offline as pyo

pyo.plot(fig, filename="example.html")
Samir Hinojosa