
There's a GAE project using GCS to store/retrieve files. These files also need to be read by code that will run on GCE (it needs C++ libraries, so it can't run on GAE).

In production, deployed on the actual GAE and GCE with the real GCS between them, this setup works fine. However, testing and developing locally is a different story that I'm trying to figure out.

As recommended, I'm running GAE's dev_appserver with GoogleAppEngineCloudStorageClient to access the (simulated) GCS. Files are put in the local blobstore. Great for testing GAE.

Since there is no GCE SDK to run a VM locally, whenever I refer to the local 'GCE' it's just my local development machine running Linux. On the local GCE side I'm just using the default boto library (https://developers.google.com/storage/docs/gspythonlibrary) with a Python 2.x runtime to interface with the C++ code and retrieve files from GCS. However, in development these files are inaccessible from boto, because they're stored in the dev_appserver's blobstore.
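
For reference, the boto side looks roughly like this (bucket and object names are placeholders; credentials come from the .boto file that 'gsutil config' writes):

# Fetch an object from GCS via boto; this works against the real GCS,
# but in development the files sit in dev_appserver's blobstore instead.
import boto

uri = boto.storage_uri('my-bucket/some/object.bin', 'gs')
data = uri.get_contents_as_string()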

Is there a way to properly connect the local GAE and GCE to a local GCS?

For now, I gave up on the local GCS part and tried using the real GCS. The GCE part with boto is easy. The GAE part can also talk to the real GCS instead of the local blobstore by setting an access token:

cloudstorage.common.set_access_token(access_token)

According to the docs:

access_token: you can get one by run 'gsutil -d ls' and copy the
  str after 'Bearer'.

That token works for a limited amount of time, so that's not ideal. Is there a way to set a more permanent access_token?

kvdb

5 Answers


There is a convenient option for accessing Google Cloud Storage from a development environment: use the client library provided with the Google Cloud SDK. After executing gcloud init locally, you get access to your resources.

As shown in the examples for Client library authentication:

from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

# Get the application default credentials. When running locally, these are
# available after running `gcloud init`. When running on compute
# engine, these are available from the environment.
credentials = GoogleCredentials.get_application_default()

# Construct the service object for interacting with the Cloud Storage API -
# the 'storage' service, at version 'v1'.
# You can browse other available api services and versions here:
#     https://developers.google.com/api-client-library/python/apis/
service = discovery.build('storage', 'v1', credentials=credentials)
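
From there, you can call the Cloud Storage JSON API through that service object. A minimal sketch (the bucket name is a placeholder):

# List the objects in a bucket via the storage v1 API.
req = service.objects().list(bucket='my-bucket')
resp = req.execute()
for obj in resp.get('items', []):
    print(obj['name'])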
Alex Dvoretsky

Google libraries come and go like tourists in a train station. Today (2020) google-cloud-storage should work on GCE and in the GAE Standard Environment with Python 3.

On GAE and GCE it picks up access credentials from the environment; locally, you can provide it with a service account JSON file like this:

GOOGLE_APPLICATION_CREDENTIALS=../sa-b0af54dea5e.json
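
With that in place, reading a file is short. A minimal sketch, assuming google-cloud-storage is installed (bucket and object names are placeholders):

# Download one object; the client picks up credentials from the
# environment, or from GOOGLE_APPLICATION_CREDENTIALS locally.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('my-bucket')
blob = bucket.blob('path/to/object.txt')
blob.download_to_filename('object.txt')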
max

If you're always using "real" remote GCS, the newer gcloud is probably the best library: http://googlecloudplatform.github.io/gcloud-python/

It's really confusing how many storage client libraries there are for Python. Some are for AE only, and they often force (or at least default to) using the local mock Blobstore when running with dev_appserver.py.

Seems like gcloud is always using the real GCS, which is what I want. It also "magically" fixes authentication when running locally.
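
For illustration, a rough sketch following the gcloud-python docs of the time (project, bucket, and object names are placeholders):

# Fetch an object with the gcloud library; authentication is resolved
# automatically, locally as well as on GCE/GAE.
from gcloud import storage

client = storage.Client(project='my-project')
bucket = client.get_bucket('my-bucket')
blob = bucket.get_blob('some/object.txt')
print(blob.download_as_string())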

Øyvind Skaar

It looks like appengine-gcs-client for Python is now only useful for production App Engine and inside dev_appserver.py, and the local examples for it have been removed from the developer docs in favor of Boto :( If you are deciding not to use the local GCS emulation, it's probably best to stick with Boto for both local testing and GCE.

If you still want to use 'google.appengine.ext.cloudstorage' though, access tokens always expire, so you'll need to refresh them manually. Given your setup, honestly the easiest thing to do is just call 'gsutil -d ls' from Python and parse the output to get a new token from your local credentials, as in the sketch below. You could use the API Client Library to get a token in a more 'correct' fashion, but at that point things would be getting so roundabout you might as well just be using Boto.
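
A rough sketch of that hack (brittle by design, since it scrapes gsutil's debug output, and the exact output format may vary between versions):

# Grab a fresh Bearer token from gsutil's debug output and hand it
# to the cloudstorage client.
import re
import subprocess

import cloudstorage

output = subprocess.check_output(['gsutil', '-d', 'ls'],
                                 stderr=subprocess.STDOUT)
match = re.search(r'Bearer\s+(\S+)', output)
if match:
    cloudstorage.common.set_access_token(match.group(1))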

Adam

There is a Google Cloud Storage local / development server for this purpose: https://developers.google.com/datastore/docs/tools/devserver

Once you have set it up, create a dataset and start the GCS development server:

gcd.sh create [options] <dataset-directory>
gcd.sh start [options] <dataset-directory>

Export the environment variables:

export DATASTORE_HOST=http://yourmachine:8080
export DATASTORE_DATASET=<dataset_id>

Then you should be able to use the datastore connection in your code, locally.

Hanxue