
I am having a problem with autosave failing in a Google Datalab notebook. I am using the SSH shell on port 8081 from the console. I open the notebook, make a few changes, and click save: works fine. I run the code, make another change, and click save again: "Autosave failed!" It appears to happen only with this notebook, and I am not sure how to diagnose it.

Any thoughts?

Problem is reproducible.

UPDATE: I have now traced this down to a code cell that displays 16 scanned images, each about 10 MB. Is there a limit on the size of the output of a single code cell, or on a Datalab notebook overall? Could that be the problem?
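For reference, a quick way to check whether the saved notebook really has grown that large is to look at the .ipynb file's size on disk; a minimal sketch (the path below is just a placeholder):

import os

# Placeholder path; point this at the saved .ipynb file.
nb_path = 'my_notebook.ipynb'
print('notebook size: {:.1f} MB'.format(os.path.getsize(nb_path) / 1e6))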

Brian F

2 Answers


The Jupyter version used in Datalab does not support uploading large files, and that is what causes this issue: saving a notebook goes through Jupyter's file save API, which fails once the notebook gets large (roughly 50 MB or more).

See https://github.com/googledatalab/datalab/issues/1324.

A workaround is to avoid embedding such large images in the notebook at all: if they are hosted somewhere, display them by URL and save only the links, or compress/downscale them before display. The goal is to get the notebook's size down to a manageable number.
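A minimal sketch of that idea, assuming the scans are local files, using Pillow and IPython's display utilities (the file names, target width, and JPEG quality below are made up):

from io import BytesIO

from IPython.display import Image, display
from PIL import Image as PILImage

def show_downscaled(path, max_width=800):
    # Open the scan and drop any alpha channel so it can be re-encoded as JPEG.
    img = PILImage.open(path).convert('RGB')
    # Downscale only if the image is wider than max_width.
    if img.width > max_width:
        new_height = int(img.height * max_width / img.width)
        img = img.resize((max_width, new_height))
    # Re-encode the small copy in memory and display it, so only the
    # compressed bytes end up in the notebook's saved output.
    buf = BytesIO()
    img.save(buf, format='JPEG', quality=85)
    display(Image(data=buf.getvalue(), format='jpeg'))

# Hypothetical file names for the scans.
for path in ['scan_01.png', 'scan_02.png']:
    show_downscaled(path)

# If the scans are hosted somewhere, displaying by URL stores only the link:
# display(Image(url='https://example.com/scan_01.jpg'))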

Ultimately, Datalab should upgrade to using notebook==5.0.0, which fixed this issue. Feel free to express your +1 on that issue. :)

yelsayed
  • Thanks. I am going to just resize them for display. I'll put a vote in for the change to datalab too. – Brian F Jul 20 '17 at 03:56

I was facing a similar issue. What worked for me was storing the data in Google Cloud Storage and reading it into the notebook from there. After that, the Datalab notebook worked fine.

import google.datalab.storage as storage
import pandas as pd
from io import BytesIO

# Placeholders: substitute your own bucket and object names.
mybucket = storage.Bucket('$Bucket_name')
data_csv = mybucket.object('$file_name')

# Read the object's bytes into the `data` variable via Datalab's %gcs magic.
uri = data_csv.uri
%gcs read --object $uri --variable data

# Parse the bytes into a pandas DataFrame.
df = pd.read_csv(BytesIO(data))
df.head()
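Because the CSV is fetched from Cloud Storage at run time, the raw data never gets embedded in the notebook itself, which keeps the saved .ipynb small enough for Jupyter's save API.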