
I am migrating a legacy Python 2.7 project on Ubuntu 18.04 from vendored dependencies to requirements.txt [piecemeal: Python 3.10 and Ubuntu 22.04 come next, then Flask!]. I removed the dependencies from the project root and listed them in my requirements.txt.

My requirements.txt contains google-cloud-storage==1.44.0 and was installed with venv-2-7/bin/python -m pip install -t lib -r requirements.txt. An appengine_config.py sits in the same directory as app.yaml and contains:

# From https://cloud.google.com/appengine/docs/legacy/standard/python/tools/using-libraries-python-27
import os

from google.appengine.ext import vendor

vendor.add('lib')
vendor.add(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'lib'))

How do I resolve the error below? Running venv-2-7/bin/python -c 'import google.cloud.storage' works, but this command does not:

$ venv-2-7/bin/python /google-cloud-sdk/platform/google_appengine/dev_appserver.py --host 127.0.0.1 .

It fails [both from PyCharm and when run manually] with:

ImportError: No module named google.cloud.storage

EDIT0: Including more information [below] as per comment requests:

app.yaml
runtime: python27
api_version: 1
threadsafe: yes

instance_class: F4
automatic_scaling:
  min_idle_instances: automatic
    
handlers:
- url: /1/account.*
  script: api_account.app

inbound_services:
- warmup

libraries: # Also tried removing this section entirely
- name: webapp2
  version: latest
- name: jinja2
  version: latest
- name: ssl
  version: latest

env_variables:
  PYTHONHTTPSVERIFY: 1

skip_files:
- ^(.*/)?.*\.py[co]$

EDIT1: Tried both solutions independently and even together:

import os
import sys

import pkg_resources
import google

if os.environ.get("GOOGLE_CLOUD_SDK_APPENGINE"):
    sys.path.insert(0, os.environ["GOOGLE_CLOUD_SDK_APPENGINE"])

for lib_dir in os.path.join(os.path.dirname(__file__), 'lib'), 'lib':
    sys.path.insert(0, lib_dir)
    google.__path__.append(os.path.join(lib_dir, 'google'))
    pkg_resources.working_set.add_entry(lib_dir)

import google.cloud.storage

Is there some trick to using from google.appengine.ext import vendor? Should I not be using /google-cloud-sdk/platform/google_appengine as my GOOGLE_CLOUD_SDK_APPENGINE env var?

EDIT2: I tried inlining google.appengine.ext.vendor and calling it in the loop, which gave me an ImportError: No module named google.cloud._helpers error.

Samuel Marks
  • Sorry, your question isn't really quite clear. Are you running Python 3 and still vendoring your library + using ```appengine_config.py```? – NoCommandLine Aug 19 '22 at 21:59
  • Python 2.7 only. Later I will upgrade to 3.10. – Samuel Marks Aug 20 '22 at 14:03
  • Could be because the latest version of google-cloud-storage which you installed ONLY supports ```Python >=3.7``` - https://pypi.org/project/google-cloud-storage/ – NoCommandLine Aug 20 '22 at 15:25
  • As I referenced, using this version https://pypi.org/project/google-cloud-storage/1.44.0/ which still supports 2.7. – Samuel Marks Aug 21 '22 at 02:08
  • I deployed a small py2.7 app with vendored `google-cloud-storage v1.44.0` such as yours, but could not see this error. You should add more context to the question, like your `app.yaml` file and the code importing the Cloud Storage library. Can you also check if you find the GCS package within your lib folder, and in `/lib/google/cloud/storage` as well? As a note, in your `appengine_config.py` both `vendor.add()` lines achieve the same, according to the [documentation](https://cloud.google.com/appengine/docs/legacy/standard/python/tools/using-libraries-python-27#copying_a_third-party_library). – ErnestoC Aug 23 '22 at 20:24

3 Answers


I didn't try to repro your error but believe you. At first glance, it looks like your appengine_config.py is incomplete. This suffices when you have non-GCP 3P dependencies:

from google.appengine.ext import vendor

# Set PATH to your libraries folder.
PATH = 'lib'
# Add libraries installed in the PATH folder.
vendor.add(PATH)

However, if your requirements.txt has GCP client libraries, e.g., google-cloud-*, your appengine_config.py needs to use pkg_resources to support their use:

import pkg_resources
from google.appengine.ext import vendor

# Set PATH to your libraries folder.
PATH = 'lib'
# Add libraries installed in the PATH folder.
vendor.add(PATH)
# Add libraries to pkg_resources working set to find the distribution.
pkg_resources.working_set.add_entry(PATH)

To bring in pkg_resources, you need to add setuptools and grpcio to your app.yaml, and to use google-cloud-storage specifically, you need to also add ssl:

runtime: python27
threadsafe: yes
api_version: 1

handlers:
- url: /.*
  script: main.app

libraries:
- name: grpcio
  version: latest
- name: setuptools
  version: latest
- name: ssl
  version: latest

All of these 3P pkg games "go away" when you finally upgrade to Python 3 where your requirements.txt remains the same, but you delete appengine_config.py and replace your app.yaml with the following (if you're not serving static files):

runtime: python310

These same instructions can also be found in the App Engine documentation on the migrating bundled services page. That page basically says what I just did above for both Python 2 and 3.

<ADVERTISEMENT>

When you're ready to upgrade to Python 3 and/or get off App Engine bundled services (NDB, Task Queue [push & pull], Memcache, Blobstore, etc.) to standalone Cloud equivalents (Cloud NDB, Cloud Tasks [push] or Cloud Pub/Sub [pull], Cloud Memorystore, Cloud Storage (GCS), etc.), or switch to Cloud Functions or Cloud Run, I've produced (well, am still producing) a modernization migration series complete with code samples, codelab tutorials, and videos, all of which complement the official migration docs. You can find more info, including links to all those resources, at its open source repo. In particular, your inquiry covers GCS, and that migration is covered by "Module 16." The Mod16 app is a sample that works with GCS and is a migration of its analog Mod15 app based on Blobstore.

</ADVERTISEMENT>

wescpy
  • This solution did not work. Is there something I'm missing? - See my question for the edits. – Samuel Marks Sep 02 '22 at 18:58
  • 0) It "did not work:" share *what* didn't work, including the errors. 1) You're making things too complicated; there's no need for an `appengine_config.py` that complex; just use the one I provided (otherwise the "it" differs). 2) `GOOGLE_CLOUD_SDK_APPENGINE` isn't a supported environment variable; see [this list](https://cloud.google.com/appengine/docs/legacy/standard/python/runtime#environment_variables_2) of supported ones. 3) It's a configuration file, so don't import the GCS client library there; do that in `main.py` 4) Suggest reducing what you deploy to a Hello World GCS app. – wescpy Sep 09 '22 at 05:57
  • yeah I know, it keeps getting more complex (because I thought I was doing something wrong). The error I get is `ImportError: No module named google.cloud.storage`. Is there some way to reset the module search hierarchy? - Because it's not finding `six.py` now, even when it is clearly present in `sys.path`. – Samuel Marks Sep 09 '22 at 14:08
  • Again, reduce it to a Hello World app where `requirements.txt` only has `six` and `google-cloud-storage`, `appengine_config.py` only has the 5 lines from my version, and `main.py` only has `import six` and `from google.cloud import storage` plus whatever you use to return a 200. For a full sample, see https://github.com/googlecodelabs/migrate-python2-appengine/tree/master/mod16-cloudstorage – wescpy Sep 15 '22 at 23:29

I can reproduce your error.

I used to have problems with protobuf (the error was somewhat similar to what you're getting for storage). The solution was to update the google namespace package. I tried the same solution now (for storage) and your error went away.

Update the code in your appengine_config.py to

    from google.appengine.ext import vendor
    import google, os

    lib_dir = os.path.join(os.path.dirname(__file__), 'lib')
    google.__path__.append(os.path.join(lib_dir, 'google'))

    vendor.add('lib')

After updating appengine_config.py, the line import google.cloud.storage no longer raises the No module named cloud.storage error.
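To illustrate why extending a package's `__path__` can help, here is a minimal, self-contained sketch. It does not touch the real `google` package: `demo_ns`, `sdk`, and `lib` are made-up names standing in for the `google` package, the SDK's copy of it, and the vendored folder. The assumption is that the first copy found on `sys.path` wins, so submodules that live only in the second copy stay invisible until that directory is added to the package's `__path__`.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, *parts):
    """Create root/parts[0]/parts[1]/... with an __init__.py at every level."""
    path = root
    for part in parts:
        path = os.path.join(path, part)
        if not os.path.isdir(path):
            os.makedirs(path)
        open(os.path.join(path, "__init__.py"), "a").close()
    return path


base = tempfile.mkdtemp()
sdk_dir = os.path.join(base, "sdk")  # hypothetical: the SDK's copy of the package
lib_dir = os.path.join(base, "lib")  # hypothetical: the vendored `lib` folder
make_package(sdk_dir, "demo_ns", "appengine")
make_package(lib_dir, "demo_ns", "cloud")

# Both directories are on sys.path, but sdk_dir comes first and wins.
sys.path.insert(0, sdk_dir)
sys.path.append(lib_dir)
import demo_ns  # resolved from sdk_dir

try:
    importlib.import_module("demo_ns.cloud")
    found_before = True
except ImportError:
    # demo_ns.__path__ only lists sdk/demo_ns, which has no `cloud` subpackage.
    found_before = False

# Tell the already-imported package to also search the vendored copy.
demo_ns.__path__.append(os.path.join(lib_dir, "demo_ns"))
cloud = importlib.import_module("demo_ns.cloud")

print(found_before, cloud.__name__)
```

The same mechanism is what `google.__path__.append(...)` relies on above; whether it is sufficient inside `dev_appserver.py` depends on how the SDK sets up its own import machinery.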

NoCommandLine
  • This solution did not work. Is there something I'm missing? - See my question for the edits. – Samuel Marks Sep 02 '22 at 18:58
  • Just to be clear, you tried my own code i.e you added it to appengine_config.py and you were still getting exactly the same error as you were getting before? If you're getting a different error, please state it – NoCommandLine Sep 02 '22 at 20:02
  • Yes I am still getting the `No module named cloud.storage` error. – Samuel Marks Sep 02 '22 at 21:03
  • Why are you trying to set the following env variables - GOOGLE_CLOUD_SDK_APPENGINE. I printed my env variables and don't see it listed – NoCommandLine Sep 03 '22 at 06:59
  • I manually set them. To infer this see my answer, I did a big hack to set `appengine_python_sdk` from a Python object in the `os.environ`. – Samuel Marks Sep 04 '22 at 00:39
  • There's an environment variable for that - ```CLOUDSDK_PYTHON```. Try this. Create a basic App just for 'Hello World'. Add your command for ```import google.cloud.storage```. Keep the contents of your ```lib``` folder. Use exactly what I have in ```appengine_config.py```. In your shell, first set the env variable ```CLOUDSDK_PYTHON``` and then try running your app with ```dev_appserver.py``` – NoCommandLine Sep 04 '22 at 09:21
  • Regardless it's not working :\ – Samuel Marks Sep 06 '22 at 02:20
  • What is your OS or Platform? – NoCommandLine Sep 06 '22 at 06:14
  • Locally I am running macOS 12.5 (21G72) with gcloud 400.0.0 (app-engine-python 1.9.101, app-engine-python-extras 1.9.96) on a freshly compiled and then `virtualenv`'d Python 2.7.18. – Samuel Marks Sep 06 '22 at 14:11

I went nuclear with this solution; I hate it with a passion:

import inspect
import io
import os.path
import site
import sys
from collections import deque
from copy import deepcopy
from itertools import chain

from distutils.sysconfig import get_python_lib
from functools import partial
from os import getcwd, listdir

import google
site_packages = os.path.join(
    os.path.dirname(os.path.dirname(sys.executable)), get_python_lib(prefix="")
)
new_sys_path = [getcwd(), site_packages]

lib_dir = os.path.join(os.path.dirname(__file__), "lib")
new_sys_path.append(lib_dir)

appengine_python_sdk = (
    os.environ["GOOGLE_CLOUD_SDK_APPENGINE"]
    if os.environ.get("GOOGLE_CLOUD_SDK_APPENGINE")
    else os.path.dirname(
        os.path.dirname(
            os.path.dirname(
                os.path.dirname(
                    os.path.dirname(
                        inspect.getsourcefile(os.environ["wsgi.errors"].__class__)
                    )
                )
            )
        )
    )
)

google_cloud_sdk_appengine = os.path.dirname(os.path.dirname(appengine_python_sdk))

non_venv_site_packages = os.path.dirname(inspect.getsourcefile(io))

google_cloud_sdk_appengine_lib = os.path.join(google_cloud_sdk_appengine, "lib")


def all_pkgs_for_dir(p):
    return map(
        partial(os.path.join, p),
        filter(
            lambda p: (lambda parts: parts[1] == "-" and parts[2][0].isdigit())(
                p.rpartition("-")
            ),
            listdir(p),
        ),
    )


app_yaml_libraries = (
    lambda g: [p[len(g) :].partition("-")[0] for p in sys.path if p.startswith(g)]
)(google_cloud_sdk_appengine_lib + os.path.sep)
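As a reading aid, the name filter inside `all_pkgs_for_dir` and the prefix extraction in `app_yaml_libraries` can be sketched on their own. The helper names and the directory listing below are hypothetical, chosen only for illustration. One small difference from the code above: this version uses `version[:1]`, so a name ending in a dash returns `False` instead of raising `IndexError`.

```python
def is_versioned_name(name):
    """Return True for names shaped like '<library>-<version>'."""
    _, sep, version = name.rpartition("-")
    # version[:1] is '' for a trailing dash, and ''.isdigit() is False.
    return sep == "-" and version[:1].isdigit()


def library_name(name):
    """Strip everything from the first dash on, keeping the library name."""
    return name.partition("-")[0]


# Hypothetical directory listing, for illustration only.
entries = ["webapp2-2.5.2", "jinja2-2.6", "six.py", "google", "trailing-"]

versioned = [entry for entry in entries if is_versioned_name(entry)]
names = [library_name(entry) for entry in versioned]

print(versioned)
print(names)
```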

Then construct the new_sys_path with the paths in the right order:

new_sys_path = list(
    chain.from_iterable(
        (
            (
                getcwd(),
                appengine_python_sdk,
                os.path.dirname(os.path.dirname(appengine_python_sdk)),
                non_venv_site_packages,
                os.path.join(non_venv_site_packages, "lib-dynload"),
            ),
            (
                # Use new dependencies from `lib` dir if found
                os.path.join(lib_dir, lib_p)
                for lib_p in os.listdir(lib_dir)
                if not lib_p.endswith(".dist-info")
                and lib_p.partition("-")[0] in app_yaml_libraries
            ),
            (
                lib_dir,
                site_packages,
                google_cloud_sdk_appengine_lib,
            ),
        )
    )
)

sys.path = deepcopy(new_sys_path)
google.__path__ = deepcopy(new_sys_path)
deque(map(site.addsitedir, new_sys_path), maxlen=0)
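For context on the last line: `site.addsitedir` is a standard-library call that adds a directory to `sys.path` and also processes any `.pth` files found in it, which plain `sys.path` assignment does not do. A minimal sketch using a temporary directory (the `extra` folder and `demo.pth` file are made up for illustration):

```python
import os
import site
import sys
import tempfile

# realpath avoids symlinked temp locations confusing the comparison below.
site_dir = os.path.realpath(tempfile.mkdtemp())
extra_dir = os.path.join(site_dir, "extra")
os.mkdir(extra_dir)

# A .pth file lists extra directories, one per line, relative to site_dir.
with open(os.path.join(site_dir, "demo.pth"), "w") as handle:
    handle.write("extra\n")

site.addsitedir(site_dir)

# Both the directory itself and the one named in demo.pth are now on sys.path.
added = [path for path in sys.path if path in (site_dir, extra_dir)]
print(added)
```

Whether the packages in `lib` actually depend on `.pth` files is an assumption to verify by listing that folder.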

If anyone has a less hacky solution I'm all ears …

FYI: I am currently debugging an ImportError: No module named six error, even though os.path.isfile(os.path.join(lib_dir, "six.py")) returns True.

Samuel Marks