
I've been trying to solve this issue for some time, and it would be great if anyone has handled a similar case. I'm building a program on AWS Lambda and need to use the scikit-learn package.

The issue is that this package is very big, so it's difficult to upload it alongside my code. The solution I found was to split the package into separate layers and upload them, attach these layers to two functions (I can only upload 50 MB at a time, so I have six layers divided between two functions), then create two layers from those two functions and attach them to my main function.

I did so, but I cannot get the package to work, nor the functions. This is how I implemented the code:

from lambdaA_function import lambdaA_handler  # from filename import method
from lambdaB_function import lambdaB_handler  # from filename import method
import json
from sklearn.feature_extraction.text import TfidfVectorizer

If anyone has another workaround, I'd be happy to hear it.

Y.D

3 Answers


You can use a prebuilt scikit-learn layer from the list at the link below, which collects layers already published in AWS:

https://github.com/mthenw/awesome-layers

But beware: it is said to support only Python 3.6, 3.7, and 3.8. If you are using Python 3.11, for example, some errors might occur.

If you really have to use a higher Python version, you might try downloading the scikit-learn package as below, then zip it and create a layer from it, although it might still exceed 50 MB. I haven't tried it.

pip install scikit-learn==<version> --no-cache-dir --no-binary :all:
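The install-then-zip step can be sketched in Python. This is only a sketch under assumptions: `build_layer_zip` and `zip_layer` are made-up helper names, not an official API; the one real constraint it encodes is that Lambda layers must place packages under a top-level `python/` directory inside the zip.

```python
# Sketch: build a Lambda layer zip with the required top-level "python/" folder.
# zip_layer / build_layer_zip and the paths are placeholder names.
import subprocess
import zipfile
from pathlib import Path

def zip_layer(build_dir: str, out_zip: str) -> str:
    """Zip build_dir so entries keep the 'python/' prefix Lambda expects."""
    root = Path(build_dir)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(root.rglob("*")):
            if f.is_file():
                zf.write(f, f.relative_to(root))  # arcname: python/sklearn/...
    return out_zip

def build_layer_zip(package: str, out_zip: str,
                    build_dir: str = "layer_build") -> str:
    """Install `package` into <build_dir>/python, then zip it as a layer."""
    target = Path(build_dir) / "python"  # Lambda only scans this folder name
    target.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["pip", "install", package, "--no-cache-dir", "--target", str(target)],
        check=True,
    )
    return zip_layer(build_dir, out_zip)
```

For scikit-learn, the resulting zip may still exceed the 50 MB direct-upload limit.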
desertnaut
Eren Sakarya
    Hi Eren, it did exceed 100 MB. The idea you gave regarding the AWS repository is an option; maybe I can find some adapters to work around the version-gap errors. – Y.D Aug 16 '23 at 09:28

I created a Lambda function that builds the zip for a package layer and uploads it to AWS. This way I managed to use these packages. The only issue is the five-layers-per-function limit, which can be overcome by merging more than one package into a single layer.

Let me know if you would like to see the code that does it; it's quite a complex function that installs, zips, and uploads the layer.
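A minimal sketch of what such a function might look like follows. Everything here is illustrative: the names (`publish_packages_as_layer`, `layer_name_for`) are made up, and publishing assumes boto3 credentials with permission to call `publish_layer_version`.

```python
# Sketch only: install one or more packages, zip them, publish as one layer.
# publish_packages_as_layer / layer_name_for are made-up names.
import re
import subprocess
import zipfile
from pathlib import Path

def layer_name_for(packages):
    """Derive a valid layer name (letters, digits, '-' and '_' only)."""
    return re.sub(r"[^A-Za-z0-9_-]", "-", "-".join(packages))

def publish_packages_as_layer(packages, build_dir="/tmp/layer_build"):
    """Install `packages` under <build_dir>/python, zip, publish one layer.

    Merging several packages into one zip helps stay under the
    five-layers-per-function limit.
    """
    import boto3  # imported here so the pure helpers work without AWS

    target = Path(build_dir) / "python"  # required top-level folder for layers
    target.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["pip", "install", *packages, "--no-cache-dir", "--target", str(target)],
        check=True,
    )
    zip_path = Path(build_dir) / "layer.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in Path(build_dir).rglob("*"):
            if f.is_file() and f != zip_path:
                zf.write(f, f.relative_to(build_dir))
    # For big zips, upload to S3 and pass S3Bucket/S3Key in Content instead.
    return boto3.client("lambda").publish_layer_version(
        LayerName=layer_name_for(packages),
        Content={"ZipFile": zip_path.read_bytes()},
        CompatibleRuntimes=["python3.11"],
    )
```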

Cheers, Y.D

Y.D

You can also upload the layer zip file to an S3 bucket and paste the Amazon S3 link URL to add it as a layer to the Lambda function.

Amazon usually suggests hosting layers above 10 MB in an S3 bucket.
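That flow can be sketched with boto3. The bucket, key, and layer names are placeholders, and it assumes credentials allowing `s3:PutObject` and `lambda:PublishLayerVersion`.

```python
# Sketch: host the layer zip in S3, then publish the layer from the S3 object.
# Bucket, key, and layer names below are placeholders.
def s3_content(bucket: str, key: str) -> dict:
    """Content argument for publish_layer_version when the zip lives in S3."""
    return {"S3Bucket": bucket, "S3Key": key}

def publish_layer_from_s3(zip_path: str, bucket: str, key: str,
                          layer_name: str):
    import boto3  # imported here so s3_content stays usable without AWS

    boto3.client("s3").upload_file(zip_path, bucket, key)  # handles big files
    return boto3.client("lambda").publish_layer_version(
        LayerName=layer_name,
        Content=s3_content(bucket, key),
        CompatibleRuntimes=["python3.11"],
    )
```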

  • It's possible, but why use S3 when you can directly upload to the layer section? It's important to remember to put all the package directories and files under a python folder and then zip it; otherwise the function doesn't recognize the package. – Y.D Aug 26 '23 at 17:31
  • I get your point: all the package files should be in a proper structure under a folder named python. But what else can you do? When AWS provides S3 as an alternate source just for hosting your layer, why not use it? No matter how big it is, just pack it into one zip and use it. – sahith palika Aug 27 '23 at 14:45
  • I agree that's a way too; if someone else can comment, I think that would be best. I'm creating a layer, and if the package is too big, it is divided into chunks; this way I can upload all of them as layers of the package (take scikit-learn, for example: it is so big you cannot even upload it to S3 without cutting it into chunks). – Y.D Aug 27 '23 at 16:57
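On the chunking point: boto3's managed transfer already splits large uploads into multipart chunks automatically, so the zip does not need to be cut up by hand. A sketch, where the part size reflects boto3's default `TransferConfig` chunk size of 8 MB and the function names are made up:

```python
# Sketch: boto3's upload_file performs multipart upload for large files
# automatically; part_count below just shows how many chunks that implies.
import math

PART_SIZE = 8 * 1024 * 1024  # boto3 TransferConfig default part size

def part_count(size_bytes: int, part_size: int = PART_SIZE) -> int:
    """Number of multipart chunks a file of size_bytes is split into."""
    return max(1, math.ceil(size_bytes / part_size))

def upload_big_zip(zip_path: str, bucket: str, key: str) -> None:
    import boto3  # managed transfer: multipart + retries under the hood

    boto3.client("s3").upload_file(zip_path, bucket, key)
```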