
A brief background: I have a serverless MySQL database on AWS RDS with a table of 70 columns and 7000+ rows. Many of the columns in this table have datatypes such as VARCHAR(2000) and BIGINT.

I'm trying to return all the rows in this table through an AWS Lambda API; the following is the handler function I'm using to achieve this:

import json
import traceback

# conn is the database connection, created at module scope;
# DecimalEncoder is a custom json.JSONEncoder defined elsewhere in the file

def handler(event, context):
    try:
        # Storing data to be returned
        dataReturn = []

        # Query the database
        with conn.cursor() as cur:
            cur.execute("SELECT <names of all 70 columns> FROM <table_name>")
            row_headers = [x[0] for x in cur.description]  # extract the column headers
            rv = cur.fetchall()
            json_data = []
            for result in rv:
                json_data.append(dict(zip(row_headers, result)))
            #print(json.dumps(json_data))
            conn.commit()

        return {
            'statusCode': 200,
            'headers': {
                "Access-Control-Allow-Origin": "*",
                "Access-Control-Allow-Methods": 'GET, POST, PUT, DELETE, OPTIONS'
            },
            'body': json.dumps(json_data, cls=DecimalEncoder)
        }
    except Exception:
        print(traceback.format_exc())
        return {
            'statusCode': 502,
            'headers': {
                "Access-Control-Allow-Origin": "*",
                "Access-Control-Allow-Methods": 'GET, POST, PUT, DELETE, OPTIONS'
            },
            'body': json.dumps("Server Error Occurred", cls=DecimalEncoder)
        }

However, when I run this it gives me a '502 server error' with the following logs:

Traceback (most recent call last):
  File "/var/task/getInstrumentsDetails.py", line 55, in handler
    print(json.dumps(json_data))
  File "/var/lang/lib/python3.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/var/lang/lib/python3.7/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/var/lang/lib/python3.7/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/var/lang/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Decimal is not JSON serializable

As a side note, I am able to return 24 columns from this table using SELECT <names_of_24_columns> FROM <table_name>, but it fails if I try to return more than 24 columns.

Is this an issue related to the amount of memory that AWS Lambda can read from RDS and return? Please suggest a suitable fix for this problem.

Thank you

r42

3 Answers


I suspect the problem isn't the number of columns, but rather that your 25th column is a decimal type; at least, that is what the error message is actually reporting.

Your return statement specifies an encoder, but your print statement does not. It is commented out in your code above, but the traceback you are showing comes from a run without the encoder:

Traceback (most recent call last):
  File "/var/task/getInstrumentsDetails.py", line 55, in handler
    print(json.dumps(json_data))
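You can reproduce the same error in isolation; a minimal sketch with a made-up value:

import decimal
import json

# The default JSON encoder has no handler for decimal.Decimal, so this raises:
# TypeError: Object of type Decimal is not JSON serializable
json.dumps({"price": decimal.Decimal("9.99")})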
E.J. Brennan
  • Thanks for your inputs, but my 9th column is also a decimal, which it is able to display. Also, it shows the same error even if I uncomment the print statement – r42 Mar 14 '20 at 07:12

I don't see the definition for your DecimalEncoder above, but you should be able to use something as simple as:

import decimal
import json

class DecimalEncoder(json.JSONEncoder):
    def default(self, obj):
        # Decimal columns (e.g. DECIMAL/NUMERIC) become JSON numbers
        if isinstance(obj, decimal.Decimal):
            return float(obj)
        # bytes columns: a single byte (e.g. a BIT(1) flag) becomes an integer,
        # longer byte strings are decoded to text
        elif isinstance(obj, bytes):
            if len(obj) == 1:
                return ord(obj)
            else:
                return obj.decode()
        return json.JSONEncoder.default(self, obj)

Note that if you want to print the result of json.dumps you need to pass the custom encoder there as well: print(json.dumps(json_data, cls=DecimalEncoder)).
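For example, with a made-up row (the keys and values are illustrative, not from the question's schema):

import decimal
import json

row = {"price": decimal.Decimal("12.50"), "is_active": b"\x00", "code": b"ABC"}
print(json.dumps(row, cls=DecimalEncoder))
# prints: {"price": 12.5, "is_active": 0, "code": "ABC"}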

Regarding the memory question: Lambda can use 128 MB to 3008 MB, in 64 MB increments. Having 7000+ rows with many VARCHAR columns can use a considerable amount of memory, so first make sure your code is correct by running your query with a LIMIT clause, and only then try it on the whole data.
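A minimal sketch of that sanity check, keeping the question's placeholder table and column names:

with conn.cursor() as cur:
    # fetch a small sample first to confirm serialization works end to end
    cur.execute("SELECT <names of all 70 columns> FROM <table_name> LIMIT 10")
    sample = cur.fetchall()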

Note: you have a dataReturn variable that you never use, and a conn.commit() call that should not be necessary when only reading (SELECTing) data.

Update: based on the new data it seems your issue is with bytes data, not Decimal: TypeError: Object of type bytes is not JSON serializable.

For example, the column Is_Record_Day_Wrkng is a bytes value of \x00; if these byte values actually represent numbers you can use ord to encode them for JSON, but if not you should use something like bytes.decode - see the updated encoder above.
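Illustratively, with made-up values:

ord(b"\x00")       # -> 0; single-byte flags such as BIT(1) columns become integers
b"hello".decode()  # -> 'hello'; longer byte strings that are really text become str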

Ionut Ticus
  • Thanks! I have tried to increase the memory of the lambda and have also tried to display 1 row using LIMIT 1, but it still gives the same error – r42 Mar 14 '20 at 07:19
  • Also, I have defined the exact same decimal encoder which is being used in my return body – r42 Mar 14 '20 at 07:25
  • Try printing `json_data` without `json.dumps` just so we can see which types it contains; the first row should suffice. – Ionut Ticus Mar 14 '20 at 09:28
  • It is displaying all the 70 key value pairs with the following: ```Traceback (most recent call last): File "/var/task/.py", line 106, in handler 'body': json.dumps(json_data, cls=DecimalEncoder)``` – r42 Mar 14 '20 at 09:45
  • There's no indication of what exception is being thrown there; use something like *pastebin* to paste the complete output. – Ionut Ticus Mar 15 '20 at 15:02
  • If your question was answered please mark it as [solved](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) – Ionut Ticus Mar 17 '20 at 22:08

Lambda functions have a cap on the amount of data they will transfer by default, but you can change it in API Gateway in your AWS console.

cd3k