
I'm trying to add the MeCab library to an AWS Lambda layer, but it doesn't work.

I want to tokenize Japanese and Korean text. Tokenization alone is enough.

Here's what I have done so far. (I followed this guide for installing Python packages for AWS Lambda layers: https://towardsdatascience.com/how-to-install-python-packages-for-aws-lambda-layer-74e193c76a91)

  1. Install Docker on an AWS EC2 instance.

  2. Write the Dockerfile and build the image

sudo vi Dockerfile

-----------------vi editor------------------
FROM amazonlinux:2.0.20191016.0
RUN yum install -y python37 && \
    yum install -y python3-pip && \
    yum install -y zip && \
    yum clean all
RUN python3.7 -m pip install --upgrade pip && \
    python3.7 -m pip install virtualenv
-----------------vi editor------------------


docker build -t lambdalayer .
  3. Run the container
docker run -it --name lambdalayer lambdalayer:latest bash

  4. Install the Python packages
python3.7 -m venv mypackages

source mypackages/bin/activate

pip install mecab-python3 -t ./python
pip install unidic-lite -t ./python
pip install --no-binary :all: mecab-python3 -t ./python
pip install -v python-mecab-ko -t ./python

deactivate
  5. Create the zip file
zip -r python.zip ./python/

docker cp lambdalayer:python.zip /home/ubuntu/
  6. Upload the zip file to AWS S3
cd /home/ubuntu

aws s3 cp python.zip s3://bukketyounghee
  7. Create the Lambda layer
aws lambda publish-layer-version --layer-name layer-search --compatible-runtimes "python3.7" --content S3Bucket=bukketyounghee,S3Key=python.zip
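
Before publishing, the archive layout can be sanity-checked. For Python runtimes, Lambda only adds python/ and python/lib/python3.x/site-packages/ inside a layer to the import path, so packages zipped anywhere else will not be importable. The following is a minimal sketch using only the standard library; the function name importable_names is made up for illustration.

```python
import re
import zipfile

# Directories inside a layer archive that Lambda's Python runtimes put on sys.path.
LAYER_PREFIX = re.compile(r"^python/(lib/python3\.\d+/site-packages/)?")


def importable_names(zip_path):
    """List the top-level entries a layer archive would expose to `import`."""
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for entry in archive.namelist():
            if entry.endswith("/"):
                continue  # skip pure directory entries
            match = LAYER_PREFIX.match(entry)
            if not match:
                continue  # outside python/, so not visible to the runtime
            rest = entry[match.end():]
            if rest:
                names.add(rest.split("/")[0])
    return sorted(names)


if __name__ == "__main__":
    import sys

    # Usage: python check_layer.py python.zip
    if len(sys.argv) > 1:
        print(importable_names(sys.argv[1]))
```

If MeCab does not appear in the printed list, the packages were zipped under the wrong directory. If it does appear, the layout is fine and the problem lies elsewhere, for example in a compiled extension built for a different platform.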

I don't know what to do next. It doesn't have to be MeCab, but I do want to use AWS Lambda because I want a serverless application. Please help me.

Thanks in advance!

Younghee Kim

1 Answer


You can create a Lambda layer using Docker, as described in the AWS blog.

With that approach, you can add MeCab to your function as follows:

  1. Create an empty folder, e.g. mylayer.

  2. Go to the folder and create a requirements.txt file with the following content:

mecab-python3
unidic-lite
  3. Run the following Docker command, which creates a layer for Python 3.8:


docker run -v "$PWD":/var/task "lambci/lambda:build-python3.8" /bin/sh -c "pip install -r requirements.txt -t python/lib/python3.8/site-packages/; exit"
  4. Archive the layer as a zip file:
zip -9 -r mylayer.zip python 
  5. Create a Lambda layer from mylayer.zip in the AWS Console. Don't forget to set the compatible runtime to python3.8.

  6. Add the layer created in step 5 to your function.

  7. I tested the layer using your code:

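import json

import MeCab  # provided by the mecab-python3 package in the layer


def lambda_handler(event, context):
    # -Owakati makes MeCab output the tokens separated by spaces
    wakati = MeCab.Tagger("-Owakati")

    # Split the space-separated output into a list of tokens
    a = wakati.parse("pythonが大好きです").split()
    return {
        'statusCode': 200,
        'body': json.dumps(a)
    }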

It works correctly:

{
  "statusCode": 200,
  "body": "[\"python\", \"\\u304c\", \"\\u5927\\u597d\\u304d\", \"\\u3067\\u3059\"]"
}
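
The tokens in the body look escaped only because json.dumps writes non-ASCII characters as \uXXXX sequences by default. As a small illustration, not part of the tested function, the following sketch decodes the body shown above and also shows the ensure_ascii=False option, which keeps the characters readable:

```python
import json

# The "body" string exactly as returned by the handler above.
body = "[\"python\", \"\\u304c\", \"\\u5927\\u597d\\u304d\", \"\\u3067\\u3059\"]"

# json.loads turns the \uXXXX escapes back into the original characters
tokens = json.loads(body)
print(tokens)  # ['python', 'が', '大好き', 'です']

# ensure_ascii=False keeps non-ASCII characters unescaped when serializing
readable_body = json.dumps(tokens, ensure_ascii=False)
print(readable_body)  # ["python", "が", "大好き", "です"]
```

Both forms are valid JSON and decode to the same list, so the choice only affects how the response looks in logs or the console.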
Marcin