
Keen to run a library of Python code, which uses "Ray", on AWS Lambda / serverless infrastructure.

Is this possible?

What I am after:

  • Ability to run Python code (with the Ray library) on serverless (AWS Lambda), utilising many CPUs/GPUs
  • Run the code from a local machine IDE (PyCharm)
  • Have graphics (e.g. Matplotlib) display on the local machine / in the local browser

One consideration is that Ray does not run on Windows.

Please let me know if this is doable (and if possible, best approach to set up).

Thank you! CWSE

Anton
cwse

2 Answers


AWS Lambda

AWS Lambda doesn't have GPU support and is poorly suited for distributed training of neural networks. Its maximum run time is 15 minutes, and its functions don't have enough memory to hold a dataset (maybe a small part of it only).

You may want AWS Lambda for lightweight inference jobs after your neural network / ML model has been trained.

As AWS Lambda autoscales, it is well suited for tasks like classifying a single image and returning the result immediately, for many concurrent users.
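To illustrate the kind of lightweight inference Lambda is good for, here is a minimal sketch of a handler. The event shape (a base64-encoded image) and the `classify` function are assumptions; a real handler would load a trained model once at cold start and run actual inference:

```python
import base64
import json

def classify(image_bytes):
    # Placeholder for real model inference; returns a toy label
    # based on the payload size so the sketch is self-contained.
    return "cat" if len(image_bytes) % 2 == 0 else "dog"

def handler(event, context):
    # Lambda passes the invocation payload as `event`;
    # here we assume the image arrives base64-encoded.
    image_bytes = base64.b64decode(event["image_b64"])
    label = classify(image_bytes)
    return {"statusCode": 200, "body": json.dumps({"label": label})}
```

Because each invocation is independent and short, this pattern scales out automatically as more users call the function.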

Ray

What you should be after for parallel and distributed training are AWS EC2 instances. For deep learning, p3 instances might be a good choice due to their Tesla V100 GPUs. For more CPU-heavy loads, c5 instances might be a good fit.

When it comes to Ray, it indeed doesn't support Windows, but it does support Docker (see the installation guide). After mounting/copying your source code into the container, you can log into a container with Ray preconfigured using this command:

docker run -t -i ray-project/deploy

and run your code from there. For Docker installation on Windows, see here. If that doesn't work, use another Docker image such as ubuntu, set up everything you need (Ray and other libraries), and run from within the container (or better yet, make the container executable so it outputs to your console, as you wanted).

It should be doable this way. If not, you can manually log into a small AWS EC2 instance, set up your environment there, and run your code from it as well.

You may wish to check this friendly introduction to Ray's settings and the Ray documentation for information on configuring your exact use case.

Szymon Maszke
import boto3, json

# Pass a profile to boto3
boto3.setup_default_session(profile_name='default')

lam = boto3.client('lambda', region_name='us-east-1')

payload = {
    "arg1": "val1",
    "arg2": "val2"
}

payloadJSON = json.dumps(payload)

# InvocationType='Event' invokes the function asynchronously;
# use 'RequestResponse' if you need the function's return value.
lam.invoke(FunctionName='some_lambda', InvocationType='Event',
           LogType='None', Payload=payloadJSON)

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/lambda.html#Lambda.Client.invoke

If you have a creds file, you can cat the ~/.aws/credentials file to get the profile to use for the session setup. https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html

ddtraveller
  • Thank you, but this doesn't really answer my original questions. Specifically: - can I run RAY python code? - can I return results, charts, etc in browser on my local machine? - can I run from Pycharm/other IDE? - can I utilize (and scale) the use of multiple CPUs/GPUs from my local machine/IDE? – cwse Apr 29 '20 at 09:28
  • You can use layers or just build a lambda with external libs. https://docs.aws.amazon.com/lambda/latest/dg/configuration-layers.html – ddtraveller Apr 29 '20 at 16:14
  • You'll likely need to push the results to an s3 bucket or database or something. I don't know what datatype you're returning. If you want to fire off the lambda from the IDE, use the code above. Push the data to somewhere and then point to that endpoint. – ddtraveller Apr 29 '20 at 16:15
  • If you're after cloud data processing you probably really want kinesis or appsync or something. – ddtraveller Apr 29 '20 at 16:16