1

I'm trying to load a model on AWS Lambda using Zappa. The problem is that the package Zappa creates and uploads to S3 is about 550 MB unzipped, which exceeds the limit. One of the packages I'm using is spaCy (a very large NLP dependency), and I can reduce its size by manually removing unused languages from its lang folder. Doing this gets the unzipped size under 500 MB. The problem is that Zappa automatically downloads the full spaCy version (spacy==2.1.4: Using locally cached manylinux wheel) on deploy and update.

I've learned that I can run zappa package, which generates a package that I can then upload myself. I unzipped the generated package, removed the unnecessary lang files, and zipped it back up. Is it possible to run zappa deploy or zappa update so that it uses the modified package and handler created by zappa package? That way Zappa can still handle the deployment.
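The manual unzip, trim, and re-zip step described above can be scripted. The following is a minimal sketch, not part of Zappa: the function name strip_unused_langs is hypothetical, and it assumes spaCy sits at the root of the archive under spacy/lang/, with one subdirectory per language code.

```python
import zipfile


def strip_unused_langs(src_zip, dst_zip, keep_langs=("en",), lang_prefix="spacy/lang/"):
    """Copy src_zip to dst_zip, skipping language subfolders not in keep_langs.

    Assumption: language data lives in <lang_prefix><code>/... inside the
    archive. Files directly inside lang/ are shared code and are kept.
    """
    kept, dropped = 0, 0
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            name = info.filename
            if name.startswith(lang_prefix):
                parts = name[len(lang_prefix):].split("/", 1)
                is_subfolder_entry = len(parts) == 2
                # Keep private folders such as __pycache__ and the wanted languages.
                wanted = parts[0] in keep_langs or parts[0].startswith("_")
                if is_subfolder_entry and not wanted:
                    dropped += 1
                    continue
            # Passing the ZipInfo preserves the name, timestamp and permissions.
            dst.writestr(info, src.read(info.filename))
            kept += 1
    return kept, dropped


# Example call (file names are placeholders):
# strip_unused_langs("package.zip", "package-slim.zip", keep_langs=("en", "xx"))
```

The function returns how many entries were kept and dropped, which makes it easy to check that only the intended language folders were removed before uploading.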

Rene B.
Negative Correlation

2 Answers

0

For me the following two things fixed that issue:

  1. AWS Lambda requires your environment to be at most 50 MB, but our packaged environment will be around 100 MB. Luckily, Lambda functions can load code from Amazon S3 without much performance loss (only a few milliseconds).

To activate this feature, you must add a new line to your zappa_settings.json

"slim_handler": true
  2. Installing only spaCy itself, without the language package (the one that python3 -m spacy download en installs). Afterwards, I uploaded the language package to S3 manually and then loaded the spaCy language "model" at runtime, similar to what is described here: Sklearn joblib load function IO error from AWS S3
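The second step could look roughly like the sketch below. It is an illustration under assumptions, not the exact code used: the bucket name, key, and function names are hypothetical, and the archive is assumed to be a .tar.gz whose top level is a spaCy model directory (the one containing meta.json). boto3 and spacy are imported lazily, so the download helper has no hard dependency on either.

```python
import os
import tarfile


def fetch_model_dir(bucket, key, dest="/tmp/spacy_model", download=None):
    """Download a .tar.gz model archive from S3 and unpack it into dest.

    The download is skipped if dest already exists, so a warm Lambda
    container reuses the files it unpacked on an earlier invocation.
    """
    if not os.path.isdir(dest):
        archive = dest + ".tar.gz"
        if download is None:
            import boto3  # lazy import; only needed for a real S3 download
            download = boto3.client("s3").download_file
        download(bucket, key, archive)
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest)
        os.remove(archive)  # free /tmp space once unpacked
    return dest


_NLP = None


def get_nlp():
    """Load the spaCy pipeline once and cache it at module level."""
    global _NLP
    if _NLP is None:
        import spacy
        # Hypothetical bucket and key; replace with your own.
        _NLP = spacy.load(fetch_model_dir("my-model-bucket", "models/en_model.tar.gz"))
    return _NLP
```

Writing to /tmp matters here because it is the writable location inside a Lambda function; caching the loaded pipeline at module level avoids reloading it on every request.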
Rene B.
0

Here's how I solved the issue. There are two ways:

  1. The first is to simply move the dependency folder from the site-packages directory to the project root folder, and then make any modifications there. This stops Zappa from downloading the manylinux wheel version of the dependency on upload.
  2. The simpler solution is to remove the *dist folder for the specific module you modified. Removing it makes Zappa skip re-downloading that module from manylinux wheels, so your modified module is what gets packaged during deployment.
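The second option can be scripted as in the sketch below. It rests on an assumption the answer does not spell out: that the "*dist folder" is the package's metadata directory named <module>-<version>.dist-info inside site-packages. The function name remove_dist_metadata is hypothetical, and it defaults to a dry run so nothing is deleted until the matches have been checked.

```python
import glob
import os
import shutil


def remove_dist_metadata(site_packages, module, dry_run=True):
    """Find (and optionally delete) <module>-*.dist-info folders.

    Assumption: the "*dist folder" is the .dist-info metadata directory.
    Returns the list of matching paths; deletes them only if dry_run is False.
    """
    pattern = os.path.join(site_packages, module + "-*.dist-info")
    matches = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    if not dry_run:
        for path in matches:
            shutil.rmtree(path)
    return matches


# Example call (the path is a placeholder for your virtualenv):
# remove_dist_metadata("venv/lib/python3.7/site-packages", "spacy", dry_run=False)
```

Removing this metadata also hides the package from pip in that environment, so it is safer to do this in a virtualenv dedicated to the deployment.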
Negative Correlation