How to add the MeCab library in AWS Lambda

Question

I'm trying to add the MeCab library to an AWS Lambda layer, but it doesn't work.

What I want is to tokenize Japanese and Korean text. Tokenizing is enough.

Here's what I have done. (I followed this guide for installing Python packages for AWS Lambda layers: https://towardsdatascience.com/how-to-install-python-packages-for-aws-lambda-layer-74e193c76a91)

Install Docker on an AWS EC2 instance.
Build the Dockerfile

sudo vi Dockerfile

-----------------vi editor------------------
FROM amazonlinux:2.0.20191016.0
RUN yum install -y python37 && \
    yum install -y python3-pip && \
    yum install -y zip && \
    yum clean all
RUN python3.7 -m pip install --upgrade pip && \
    python3.7 -m pip install virtualenv
-----------------vi editor------------------


docker build -t lambdalayer .

Run

docker run -it --name lambdalayer lambdalayer:latest bash

Install python packages

python3.7 -m venv mypackages

source mypackages/bin/activate

pip install mecab-python3 -t ./python
pip install unidic-lite -t ./python
pip install --no-binary :all: mecab-python3 -t ./python
pip install -v python-mecab-ko -t ./python

deactivate

Zip the packages

zip -r python.zip ./python/

docker cp lambdalayer:python.zip /home/ubuntu/

Upload to AWS S3

cd /home/ubuntu

aws s3 cp python.zip s3://bukketyounghee

Make a Lambda layer

aws lambda publish-layer-version --layer-name layer-search --compatible-runtimes "python3.7" --content S3Bucket=bukketyounghee,S3Key=python.zip
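For a Python runtime, Lambda extracts a layer under /opt and picks up packages from the top-level python/ folder of the archive (or python/lib/python3.x/site-packages/ beneath it). A minimal sketch for checking an archive's layout before publishing, using only the standard library; the function names and the demo archive contents are hypothetical and only for illustration:

```python
import io
import zipfile


def misplaced_entries(zip_source):
    """Return archive entries that sit outside the top-level python/ folder."""
    with zipfile.ZipFile(zip_source) as archive:
        return [
            name for name in archive.namelist()
            if not name.startswith("python/")
        ]


def build_demo_zip(entry_names):
    """Build a small in-memory zip (hypothetical contents, for illustration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entry_names:
            archive.writestr(name, "")
    buffer.seek(0)
    return buffer


# One entry is correctly placed, the other is not.
demo = build_demo_zip(["python/MeCab/__init__.py", "MeCab/__init__.py"])
print(misplaced_entries(demo))  # ['MeCab/__init__.py']
```

For a real archive, the same function can be called with a path, e.g. misplaced_entries("python.zip"); an empty list means every entry is under python/.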

I don't know what I should do next. It doesn't have to be the MeCab library, but I want to use AWS Lambda because I want a serverless application. Please help me.

Thanks in advance!

Glad it worked. If my answer was helpful, its acceptance would be appreciated. — Marcin, May 15 '21 at 22:17

score 2 · Accepted Answer · answered May 15 '21 at 10:55

You can create a Lambda layer using Docker, as described in the AWS blog.

Thus you can add mecab to your function as follows:

1. Create an empty folder, e.g. mylayer.
2. Go to the folder and create a requirements.txt file with the following content:

mecab-python3
unidic-lite

3. Run the following docker command, which creates a layer for python3.8:

docker run -v "$PWD":/var/task "lambci/lambda:build-python3.8" /bin/sh -c "pip install -r requirements.txt -t python/lib/python3.8/site-packages/; exit"

4. Archive the layer as a zip file:

zip -9 -r mylayer.zip python

5. Create a Lambda layer based on mylayer.zip in the AWS Console. Don't forget to set the compatible runtime to python3.8.
6. Add the layer created in step 5 to your function.
7. I tested the layer using your code:

import json

import MeCab

def lambda_handler(event, context):
    # -Owakati makes MeCab emit space-separated tokens.
    wakati = MeCab.Tagger("-Owakati")

    a = wakati.parse("pythonが大好きです").split()
    return {
        'statusCode': 200,
        'body': json.dumps(a)
    }

It works correctly:

{
  "statusCode": 200,
  "body": "[\"python\", \"\\u304c\", \"\\u5927\\u597d\\u304d\", \"\\u3067\\u3059\"]"
}
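The \u escapes in the body are only JSON's ASCII-safe encoding of the Japanese tokens, not a tokenizer problem. A small sketch of how a caller could decode the body shown above, and how the tokens could be serialized readably with ensure_ascii=False; the variable names are illustrative:

```python
import json

# The response shown above, written as a Python dict.
response = {
    "statusCode": 200,
    "body": "[\"python\", \"\\u304c\", \"\\u5927\\u597d\\u304d\", \"\\u3067\\u3059\"]",
}

# json.loads turns the escaped body back into a list of strings.
tokens = json.loads(response["body"])
print(tokens)  # ['python', 'が', '大好き', 'です']

# ensure_ascii=False keeps non-ASCII characters as-is when serializing.
readable_body = json.dumps(tokens, ensure_ascii=False)
print(readable_body)  # ["python", "が", "大好き", "です"]
```

Either form decodes to the same list, so using ensure_ascii=False in the handler would only be a readability choice.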

Thank you so much! You saved me! It worked out. — Younghee Kim, May 15 '21 at 14:02
