
I am having trouble creating a Lambda layer for the xgboost library. I'm running:

I'm grabbing a zip of xgboost and its dependencies from here (https://github.com/alexeybutyrev/aws_lambda_xgboost) and loading it into a layer. When I test my Lambda function, I get this error:

Unable to import module 'lambda_function': No module named 'xgboost.core'

It looks like __init__.py is trying to reference core.py via from .core import <stuff>
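One way to narrow this down is to check whether the layer zip actually contains core.py where the import expects it. The snippet below is only a rough diagnostic sketch: the function name missing_package_files is hypothetical, and it assumes the layer keeps packages directly under a top-level python/ folder.

```python
import zipfile


def missing_package_files(layer_zip, package="xgboost",
                          expected=("__init__.py", "core.py")):
    """Return the expected files that are absent from python/<package>/ in the zip."""
    with zipfile.ZipFile(layer_zip) as zf:
        names = set(zf.namelist())
    # Assumption: packages sit directly under python/ inside the layer.
    prefix = f"python/{package}/"
    return [name for name in expected if prefix + name not in names]
```

Calling missing_package_files("my_layer.zip") returns an empty list when both files are present. A non-empty result would suggest the zip layout is the problem rather than Lambda itself.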

Has anyone encountered this error with AWS Lambda before?

kichik
Alex

2 Answers

EDIT: As @Marcin remarked, the first approach below (A) only works for packages under 262 MB.

A. Python Packages within Lambda Layer size limit

You can also do this with the AWS SAM CLI and Docker (see this link to install the SAM CLI), building the packages inside a container. Basically, you initialize a default template with Python as the runtime and then list the packages in the requirements.txt file. I found this easier than the article you mentioned. I'll leave the steps here in case you want to use them in the future.

1. Initialize a default SAM template

Under any folder that you want to keep the project, you can type

sam init

This prompts a series of questions. For a quick setup, choose the Quick Start Templates as follows:

1 - AWS Quick Start Templates

2 - Python 3.8

Project name [sam-app]: your_project_name

1 - Hello World Example

Choosing the Hello World Example generates a default Lambda function with a requirements.txt file. Next, edit that file to list the package you want, in this case xgboost.

2. Specify packages to install

cd your_project_name
code hello_world/requirements.txt

Since I use Visual Studio Code as my editor, this opens the file in it. Now I can specify the xgboost package:

your_python_package

This is where Docker comes in. Some packages rely on compiled C++ code, so it is recommended to build them inside a container (particularly on Windows). Move to the folder where the template.yaml file is located and run the following, where -u is short for --use-container:

sam build -u

3. Zip packages

Some files should not be included in the Lambda layer, because we only want to keep the Python libraries. You can remove the following files:

rm .aws-sam/build/HelloWorldFunction/app.py
rm .aws-sam/build/HelloWorldFunction/__init__.py
rm .aws-sam/build/HelloWorldFunction/requirements.txt

and then zip the remaining content of the folder.

cp -r .aws-sam/build/HelloWorldFunction/ python/
zip -r my_layer.zip python/

Here the layer contents are placed in the python/ folder, as the docs require. On Windows, replace the zip command with Compress-Archive python/ my_layer.zip.
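If you prefer a cross-platform alternative to the rm, cp and zip commands, the same step can be sketched in Python. This is an illustrative sketch rather than part of the SAM workflow: build_layer_zip is a hypothetical helper, and it assumes the zip is written outside the build folder so it does not include itself.

```python
import os
import zipfile

# Top-level files from the Hello World function that should not go into the layer.
EXCLUDED = {"app.py", "__init__.py", "requirements.txt"}


def build_layer_zip(build_dir, zip_path):
    """Zip everything under build_dir into zip_path beneath a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(build_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, build_dir)
                # Only skip the function's own top-level files; a package's
                # __init__.py (e.g. xgboost/__init__.py) must be kept.
                if rel in EXCLUDED:
                    continue
                zf.write(full, "python/" + rel.replace(os.sep, "/"))
    return zip_path
```

For example, build_layer_zip(".aws-sam/build/HelloWorldFunction", "my_layer.zip") would produce an archive whose entries all start with python/.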

4. Upload your Layer to AWS

In the AWS console, go to Lambda, then choose Layers and Create Layer. From there you can upload your .zip file.

Note that for zip files over 50 MB, you should upload the .zip file to an S3 bucket and provide its path, for example https://s3.amazonaws.com/mybucket/my_layer.zip.
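To decide which route applies before uploading, a quick size check can help. The sketch below is an approximation: upload_route is a hypothetical helper, the 50 MB and 262 MB thresholds are the figures quoted in this answer, and it assumes the larger limit applies to the unzipped contents, which is how AWS documents it.

```python
import os
import zipfile

MB = 1000 * 1000
DIRECT_UPLOAD_LIMIT = 50 * MB  # above this, upload through S3 (figure from this answer)
UNZIPPED_LIMIT = 262 * MB      # overall limit quoted in this answer, assumed to be unzipped size


def upload_route(zip_path, direct_limit=DIRECT_UPLOAD_LIMIT,
                 unzipped_limit=UNZIPPED_LIMIT):
    """Classify a layer zip as 'direct upload', 'upload via S3' or 'too large for a layer'."""
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        unzipped = sum(info.file_size for info in zf.infolist())
    if unzipped > unzipped_limit:
        return "too large for a layer"
    if zipped > direct_limit:
        return "upload via S3"
    return "direct upload"
```

Running upload_route("my_layer.zip") on the xgboost layer would be expected to report that it is too large, which is the case covered in section B.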

B. Python packages that exceed Lambda Layer limits

The xgboost package on its own is more than 300 MB, so creating the layer fails with an error about exceeding the size limit.

As @Marcin kindly pointed out, the SAM CLI approach above does not work directly for Python layers that exceed the limit. There's an open issue on GitHub about specifying a custom Docker image when running sam build -u, and a possible workaround that retags the default lambda/lambci image.

So, how can we get around this? There are already some useful resources that I will just point to.

  • First, the Medium article that @Alex took as his solution, which follows this repo's code.
  • Second, alexeybutyrev's approach, which applies the strip command to reduce the size of the libraries. It is available in a GitHub repo, with instructions provided.
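To give an idea of what the strip approach involves, here is a rough Python sketch. It is not the script from that repo: find_shared_objects and strip_shared_objects are hypothetical helpers, and it assumes the strip binary (from binutils) is on the PATH of the Linux machine or container where the packages were built.

```python
import os
import subprocess


def find_shared_objects(root):
    """Return paths of shared libraries (*.so, *.so.N) under root, sorted."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".so") or ".so." in name:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def strip_shared_objects(root):
    """Run `strip` on each shared library and return the number of bytes saved."""
    saved = 0
    for path in find_shared_objects(root):
        before = os.path.getsize(path)
        # Assumption: `strip` is installed; check=True raises if it fails.
        subprocess.run(["strip", path], check=True)
        saved += before - os.path.getsize(path)
    return saved
```

You would run strip_shared_objects("python/") before zipping the layer. How much space it saves depends on how the libraries were compiled.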

Edit (December 2020)

This month AWS released container image support for AWS Lambda. Assuming the following tree structure for your project

Project/
|-- app/
|   |-- app.py
|   |-- requirements.txt
|   |-- xgb_trained.bin
|-- Dockerfile
 

You can deploy an XGBoost model with the following Docker image. See this repo's instructions for a detailed explanation.

# Dockerfile based on https://docs.aws.amazon.com/lambda/latest/dg/images-create.html

# Define global args
ARG FUNCTION_DIR="/function"
ARG RUNTIME_VERSION="3.6"

# Choose buster image
FROM python:${RUNTIME_VERSION}-buster as base-image

# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
  apt-get install -y \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev \
  git


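# Include global args in this stage of the build
# (ARGs declared before FROM are not visible inside a stage unless redeclared)
ARG FUNCTION_DIR
ARG RUNTIME_VERSION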
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy function code
COPY app/* ${FUNCTION_DIR}/

# Install python dependencies and runtime interface client
RUN python${RUNTIME_VERSION} -m pip install \
                   --target ${FUNCTION_DIR} \
                   --no-cache-dir \
                   awslambdaric \
                   -r ${FUNCTION_DIR}/requirements.txt

# Install xgboost from source
RUN git clone --recursive https://github.com/dmlc/xgboost
RUN cd xgboost; make -j4; cd python-package; python${RUNTIME_VERSION} setup.py install; cd;

# Second stage: builds on top of the stage above rather than a fresh image,
# so the xgboost installed into site-packages remains available
FROM base-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

# Copy in the build image dependencies
COPY --from=base-image ${FUNCTION_DIR} ${FUNCTION_DIR}

ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]

CMD [ "app.handler" ]
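The Dockerfile expects app/app.py to expose a function named handler. A minimal sketch of what that file could look like follows. The event shape (a "features" list of rows), the MODEL_PATH environment variable and the predict parameter are all assumptions made for illustration, not something the repo prescribes.

```python
import json
import os

# Hypothetical: model path relative to the working directory set by WORKDIR.
MODEL_PATH = os.environ.get("MODEL_PATH", "xgb_trained.bin")
_booster = None


def _predict(rows):
    """Load the booster once per container and score a list of feature rows."""
    global _booster
    # Imported lazily so this module can be loaded even where xgboost is absent.
    import numpy as np
    import xgboost as xgb
    if _booster is None:
        _booster = xgb.Booster()
        _booster.load_model(MODEL_PATH)
    return _booster.predict(xgb.DMatrix(np.asarray(rows, dtype=float))).tolist()


def handler(event, context, predict=_predict):
    """Lambda entry point; `predict` can be swapped out for local testing."""
    rows = event.get("features")
    if not rows:
        return {"statusCode": 400, "body": json.dumps({"error": "missing 'features'"})}
    return {"statusCode": 200, "body": json.dumps({"predictions": predict(rows)})}
```

Keeping the booster in a module-level variable means it is loaded once and reused across warm invocations. The extra predict argument is only there so the handler can be exercised locally without xgboost installed.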
Miguel Trejo
  • Hi. Just curious. What is the size of the layer in zip? – Marcin Aug 19 '20 at 07:34
  • @Marcin, you're right, the single XGBoost zip is over 300 MB. This approach works directly with files under 50 MB, and via S3 with files under 260 MB. I will edit the answer to cover these cases and the xgboost one. Thanks for clarifying! – Miguel Trejo Aug 19 '20 at 14:39
  • I tried this approach earlier, and even with XGBoost library under 260MB it did not work (I am not sure why). Strange that it would work on an EC2 running amazon linux, but not a docker image running amazon linux. – Alex Aug 19 '20 at 15:29
  • @Alex, yes, you're right. Apparently, the SAM CLI approach could work if a custom Docker image is built and retagged as the image that sam build -u pulls. – Miguel Trejo Aug 19 '20 at 16:02
  • @MiguelTrejo did you ever come across the error no module named xgboost.core? I am flagging this as the answer to my question because of the detail and variety of solutions provided. I am not sure if the actual error of no module named xgboost.core ever came up though. – Alex Aug 19 '20 at 17:20
  • @Alex, does this error pop up when importing xgboost inside Lambda? Could you provide some context? – Miguel Trejo Aug 19 '20 at 17:28
  • If I use the zip of XGBoost that you mentioned above from alexeybutyrev as a layer, when I run a test Lambda I get the error "cannot import module xgboost.core". I solved this by creating a new zip on an EC2 instance running Amazon Linux, but I don't actually know why it can't find the core.py file in the xgboost library. – Alex Aug 19 '20 at 19:16
  • @MiguelTrejo I thought that the layer will be large. But anyway, glad it worked out at the end. Good detailed answer. – Marcin Aug 19 '20 at 21:30
  • Well the A method is not compatible with xgb 1.0.2 & py36 , the size is over 260mb :( – Cristián Vargas Acevedo Jun 03 '21 at 01:51
  • @CristiánVargasAcevedo you can use a docker image, I've provided an example in this [repo](https://github.com/gokavak/lambda-docker-image-pytorch-xgboost) – Miguel Trejo Jun 03 '21 at 14:43

I was never able to figure out why it failed in this way. The solution that worked for me was to create an EC2 instance running Amazon Linux, install and zip the libraries there, then save the zip to S3. See here for detailed instructions:

https://medium.com/@lucashenriquessilva/how-to-create-a-aws-lambda-python-layer-db2830e08b12

Alex