
I have a very simple webservice in AWS Lambda using Python and Flask (Service A). The service receives a request, performs a DynamoDB query, and returns the results. The DynamoDB table uses on-demand capacity, and in almost all cases the query returns 1 result.

I perform the query with the following class.

import boto3


class DynamoDB:

    def __init__(self):
        # Create the DynamoDB resource once per instance
        session = boto3.Session()
        self.dynamodb = session.resource('dynamodb')

    def query(self, table_name, **kwargs):
        # Select the table
        table = self.dynamodb.Table(table_name)

        # Run the query and return the raw response
        return table.query(**kwargs)

Query Expression

"#user_id = :user_id and begins_with( #sort_key, :sort_key)" 

Response size ~ 400B
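
For reference, the query expression above might be passed to `query()` roughly as in the sketch below. The helper name `build_query_kwargs` and the real attribute names `user_id` and `sort_key` are assumptions for illustration; the question only shows the `#`/`:` placeholders.

```python
def build_query_kwargs(user_id, sort_key_prefix):
    """Build keyword arguments for Table.query (hypothetical helper)."""
    return {
        # Key condition exactly as given in the question
        "KeyConditionExpression": "#user_id = :user_id and begins_with( #sort_key, :sort_key)",
        # Assumed mapping from placeholders to the table's real attribute names
        "ExpressionAttributeNames": {"#user_id": "user_id", "#sort_key": "sort_key"},
        # With the boto3 resource API, values are plain Python types
        "ExpressionAttributeValues": {":user_id": user_id, ":sort_key": sort_key_prefix},
    }


kwargs = build_query_kwargs("user-123", "order#")
# These would then be forwarded as: DynamoDB().query("some-table", **kwargs)
```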

I have run into performance issues: a single request takes 1040ms with the AWS Lambda memory set to 128MB (Max Memory Used is 95-100 MB). All of that time except about 4ms is spent in the DynamoDB query.

Below are the response times when I increase the memory.

128  MB  -> 1040 ms
512  MB  -> 520  ms
1024 MB  -> 210  ms  

I also have another webservice in AWS Lambda (Service B), which uses Python, Flask, Pandas and PyODBC. The service receives a request, performs 2 simple queries against an MSSQL server that is not hosted in AWS, and returns the results. This service has 128MB of memory and Max Memory Used: 128 MB (it consumes all the memory). A single request to this service takes 500ms.

Can someone explain to me how that is possible?

Is there any way to make the query in Service A faster?

NoSQLKnowHow
dapo
  • [This](https://stackoverflow.com/a/45278624/14843902) thread might be of use. – amitd Jan 29 '21 at 13:37
  • I updated the code. Yes, I'm using boto3 to perform the query. – dapo Jan 29 '21 at 13:44
  • @amitd Thanks for your answer, but neither of those cases (table scanning, hot partition) is the issue. The table is used only by this service and no one is writing to it during the request. The service is still under testing, so it only receives requests from me, and neither of them can be the answer. – dapo Jan 29 '21 at 13:45
  • Network bandwidth is also something that is scaled based on the memory of the function. How large are the items and can you share more about the query criteria? – Maurice Jan 29 '21 at 13:52
  • @Maurice Thanks for your answer. I updated my question with the info you asked for. – dapo Jan 29 '21 at 13:59
  • Make sure you instantiate the boto3 library *outside* (above) the handler. I can't see whether you are doing that, as the handler code isn't included. – F_SO_K Jan 31 '21 at 08:02
  • @F_SO_K Yeah, that's the solution. I moved the ddb = DynamoDB() outside of the handler and increased the memory to 256MB. As a result, the response time dropped to 67ms. – dapo Jan 31 '21 at 09:20

2 Answers


A couple of things that might help you:

  • The amount of RAM you provision not only influences the compute, but also the network throughput of your Lambda function, so depending on your workload this may be a limit.
  • Instantiating boto3 resources and clients is typically relatively expensive in terms of compute, so it's definitely worth caching them to shave a few milliseconds off your time. On my relatively powerful notebook it takes about 150ms to instantiate the first boto3 client or resource, because on first instantiation it reads and parses some JSON descriptions and builds the whole object hierarchy, which takes a while.
  • You could consider adding the X-Ray SDK to your function and enabling X-Ray on it. This will give you more detailed insights into which part of your application and which API call took so long.
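
Before reaching for X-Ray, a quick way to check how much of the request is client initialization rather than the query itself is to time each step separately. This is only a minimal sketch: `timed` and `make_dynamodb_resource` are hypothetical helper names, and it assumes the function runs somewhere boto3 is installed and a region is configured (as it is inside Lambda).

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def make_dynamodb_resource():
    # Imported here so the import cost is included in the measurement.
    import boto3
    return boto3.resource("dynamodb")


# Intended use inside the function (not executed here):
#   resource, init_ms = timed(make_dynamodb_resource)
#   response, query_ms = timed(resource.Table(table_name).query, **kwargs)
#   print(f"init={init_ms:.0f}ms query={query_ms:.0f}ms")
```

If `init_ms` dominates, the time is going into building the client, not into DynamoDB.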

Edit

Boy does memory size matter when instantiating boto3 for the first time! I'm putting a blog post together about the methodology, but it seems that it takes a very long time to initialize the first boto3 client/resource after a Lambda cold start if the memory setting is very small.

Graph

Maurice

I moved the ddb = DynamoDB() outside of the handler and increased the memory of the Lambda function to 256MB. As a result, the response time dropped to 67ms - 75ms.
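
The difference between the two layouts can be sketched as follows. The `DynamoDB` class here is a stand-in that only counts how often it is constructed instead of calling boto3, and the handler names and event shape are hypothetical, so this illustrates the pattern rather than the actual service code.

```python
class DynamoDB:
    """Stand-in for the wrapper in the question; counts constructions."""
    instances = 0

    def __init__(self):
        # In the real class, boto3.Session().resource('dynamodb') runs here,
        # which is the expensive part.
        DynamoDB.instances += 1

    def query(self, table_name, **kwargs):
        # Placeholder result; the real method calls table.query(**kwargs).
        return {"Items": [], "Count": 0}


def handler_before(event, context):
    ddb = DynamoDB()  # rebuilt on every invocation, warm or cold
    return ddb.query(event["table"])


ddb = DynamoDB()  # module scope: built once per execution environment


def handler_after(event, context):
    return ddb.query(event["table"])  # reuses the module-level instance
```

Calling `handler_after` repeatedly leaves `DynamoDB.instances` unchanged, while each call to `handler_before` increments it, which is why only the first (cold) request pays the initialization cost in the second layout.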

dapo
  • This is what the docs advise here: https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html#function-code – Joris Kok Aug 16 '21 at 20:25