
I have started trying to use Google Cloud Datalab. While I understand it is a Beta product, I find the docs very frustrating, to say the least.

The unanswered questions here, together with the lack of new revisions or docs over the several months the project has been available, make me wonder whether there is any commitment to the product.

A good beginning would be a notebook showing data ingestion from external sources into both the Datastore system and the BigQuery system. That is a common use case. I'd like to use my own data, and it would be great to have a notebook to ingest it. It seems that should be doable without huge effort, and it would get me (and others) out of this mess of trying to link the terse docs from various products and workspaces together and get them working.

In addition, a better explanation of the GitHub connection process would help (see my prior question).

dartdog

2 Answers


For BigQuery, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/BigQuery/Importing%20and%20Exporting%20Data.ipynb

For GCS, see here: https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/tutorials/Storage/Storage%20Commands.ipynb

Those are the only two storage options currently supported in Datalab (which, in any event, should not be used for large-scale data transfers; these are for small-scale transfers that fit in memory on the Datalab VM).
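To make the "fits in memory" point concrete, here is a minimal sketch of staging a small dataset as a pandas DataFrame inside the notebook before handing it to the BigQuery import APIs shown in the linked notebook. The CSV content and any dataset/table names are hypothetical examples, and the exact Datalab insert call should be taken from the linked tutorial rather than from this sketch.

```python
# Minimal sketch: stage a small, in-memory dataset in a Datalab notebook.
# The data below is a hypothetical example.
import io

import pandas as pd

csv_text = """name,score
alice,10
bob,7
"""

# Small-scale data that fits comfortably in the Datalab VM's memory.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)  # (2, 2)

# From here, the Datalab BigQuery APIs demonstrated in the linked
# "Importing and Exporting Data" notebook would write `df` to a table;
# see that notebook for the exact call.
```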

For Git support, see https://github.com/GoogleCloudPlatform/datalab/blob/master/content/datalab/intro/Using%20Datalab%20-%20Managing%20Notebooks%20with%20Git.ipynb. Note that this has nothing to do with GitHub, however.

As for the low level of activity recently, that is because we have been heads down getting ready for GCP Next (which happens this coming week). Once that is over we should be able to migrate a number of new features over to Datalab and get a new public release out soon.

Graham Wheeler
  • Thank you for the pointers; it was seeming kind of grim on the activity front. And I do think the project has huge potential. I'd like to see it take off. – dartdog Mar 21 '16 at 13:51
  • None of these examples shows uploading from a local machine to Cloud Storage? – dartdog Mar 21 '16 at 13:53

Datalab isn't running on your local machine; only the presentation layer is in your browser. So if you mean the browser client machine, that wouldn't be a good solution: you'd be moving data from the local machine to the VM running the Datalab Python code (which has limited storage space), and then moving it again to the real destination. Instead, you should use the Cloud Console or (preferably) the gcloud command line on your local machine for this.
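As a sketch of the recommended path, the command-line route amounts to a single `gsutil cp` from the local machine straight to a GCS bucket, bypassing the Datalab VM entirely. The helper below just composes that invocation; the bucket and paths are hypothetical examples, and it assumes the Cloud SDK (`gsutil`) is installed and authenticated.

```python
# Hedged sketch: compose the `gsutil cp` command for uploading a local file
# directly to Cloud Storage, as the answer recommends. Bucket and paths
# below are hypothetical.

def gsutil_cp_command(local_path, bucket, object_name):
    """Build the argument list for copying a local file into a GCS bucket."""
    return ["gsutil", "cp", local_path, "gs://%s/%s" % (bucket, object_name)]

cmd = gsutil_cp_command("data/sales.csv", "my-project-bucket", "ingest/sales.csv")
print(" ".join(cmd))  # gsutil cp data/sales.csv gs://my-project-bucket/ingest/sales.csv
```

Once `gcloud auth login` has been run, the command can be executed with `subprocess.run(cmd, check=True)`; from there the data is in GCS, where the Datalab storage notebook's APIs can reach it.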

Graham Wheeler
  • Well, I get your point, but I don't see why those commands can't run in the browser/notebook to read from the local machine and write to the Cloud Storage containers. It might need a widget, but it would massively help usability for analysis. The browser can access both? – dartdog Mar 21 '16 at 14:05
  • The hassle here is getting stuff into and out of the environment with the right paths and ownership for analysis from local storage. – dartdog Mar 21 '16 at 14:07
  • We don't really want to reimplement large parts of the cloud console in the Datalab UX, and that's what this would amount to (because going through the Datalab VM/Python code would be a complete non-starter for performance reasons). We will have an alternative approach soon that might satisfy your needs. – Graham Wheeler Mar 21 '16 at 14:12
  • All we would need is the ability, plus examples, to access the cloud console using the Terminal facility in Jupyter and/or via the magic facility. – dartdog Mar 21 '16 at 14:21