1

I'm struggling to parse output of ExecuteStatement action using Amazon RDS Data service https://docs.aws.amazon.com/rdsdataservice/latest/APIReference/API_ExecuteStatement.html

Documentation is very poor and I can't find any meaningful examples.

I'm using Aurora MySql DB

Output is returned like this. This example is 2 rows of data returned by SELECT statement, I've replaced actual data with word "data"

 {'ResponseMetadata': {'RequestId': '955a6aee-5bad-4f87-a455-b83a10a8a31b', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '955a6aee-5bad-4f87-a455-b83a10a8a31b', 'content-type': 'application/json', 'content-length': '809', 'date': 'Tue, 02 Jun 2020 05:39:22 GMT'}, 'RetryAttempts': 0}, 'numberOfRecordsUpdated': 0, 'records': [[{'stringValue': 'data'}, {'longValue': data}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': ''}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}], [{'stringValue': 'data'}, {'longValue': data}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': ''}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}, {'stringValue': 'data'}]]}

I need to extract actually only this data fields, so I can use them for INSERT statement in another table.

As per documentation original response is dictionary data type, so I get 3 keys here 'ResponseMetadata', 'numberOfRecordsUpdated' and 'records'. As per documentation 'records' is Type: Array of arrays of Field objects and here is where I stuck. How I can extract Field objects from array of arrays inside dictionary?

Petr Kashlikov
  • 101
  • 1
  • 11

3 Answers3

2

There are many ways, but the most basic nested for-loop in python could suffice:

data_values = []

for record in result['records']:
    print()
    row_data = []
    for data_dict in record:
        #print(data_dict)
        for data_type, data_value in data_dict.items():
            print(data_type, data_value)
            row_data.append(data_value)

    data_values.append(row_data)

print(data_values)  

The code can for a good basis for modifications to suit your needs. It produces the following:

stringValue data
longValue data
stringValue data
stringValue data
stringValue 
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data

stringValue data
longValue data
stringValue data
stringValue data
stringValue 
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data
stringValue data

And data_values:

[['data', 'data', 'data', 'data', '', 'data', 'data', 'data', 'data', 'data', 'data', 'data', 'data'], ['data', 'data', 'data', 'data', '', 'data', 'data', 'data', 'data', 'data', 'data', 'data', 'data']]
Marcin
  • 215,873
  • 14
  • 235
  • 294
  • 1
    Thanks, this is very close to what I need. I used only data_value in output and it is what I need, will just need to transform it to a bit different format so I can use it for another INSERT statement – Petr Kashlikov Jun 02 '20 at 06:52
  • @PetrKashlikov No problem. You can put `data_value` into a list or some other data structure which you can process later on, as in the edited version of the answer. – Marcin Jun 02 '20 at 06:54
  • thanks again. The only minor problem with this approach I spotted, that if I load it into list or tuple it is actually all values are together like [a1,b1,c1,a2,b2,c2], ideally I need to split them as per rows retrieved. Something like this [(a1,b1,c1),(a2,b2,c2)] – Petr Kashlikov Jun 02 '20 at 07:16
  • @PetrKashlikov I modified the answer. – Marcin Jun 02 '20 at 07:19
1

You are getting this

response = 
{'ResponseMetadata': {'HTTPHeaders': {'content-length': '809',
                                      'content-type': 'application/json',
                                      'date': 'Tue, 02 Jun 2020 05:39:22 GMT',
                                      'x-amzn-requestid': '955a6aee-5bad-4f87-a455-b83a10a8a31b'},
                      'HTTPStatusCode': 200,
                      'RequestId': '955a6aee-5bad-4f87-a455-b83a10a8a31b',
                      'RetryAttempts': 0},
 'numberOfRecordsUpdated': 0,
 'records': [[{'stringValue': 'data'},
              {'longValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': ''},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'}],
             [{'stringValue': 'data'},
              {'longValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': ''},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'},
              {'stringValue': 'data'}]]}

You can read yuor rows by iterating response['records']

data_fields = []
for row in response['records']:
    for field in row:
        if field['stringValue']:
           print(field)
           data_fields.append(field)
        elif field['longValue']:
           print(field)
           data_fields.append(field)

Consider to use dictionary cursor by directly accessing database by mysql connection configuration like so you will not see aws request metadata.

import mysql.connector

cnx = mysql.connector.connect(user='username', password='password',
                              host='aws rds mysql host',
                              database='database_name')
cnx.close()

You can see mysql connection details from AWS console.

https://dev.mysql.com/doc/connector-python/en/connector-python-example-connecting.html

Alok
  • 7,734
  • 8
  • 55
  • 100
  • This code produce just full output of 2 rows, I need just list of 'data' fields. Also I'm aware of mysql connector, but have requirement to use boto3 in this case – Petr Kashlikov Jun 02 '20 at 06:40
0

Alternatively if you want to handle your data in a 'dict-like' fashion, if you're using pydantic model parsing for example, you can do something like this

NOTE: you need to add

includeResultMetadata=True

in the execute_statement call for columnMetadata to be present

columns_metadata = result.get('columnMetadata')

object_list = []

for record in result.get('records'):
    row_dict = {}
    for i in range(len(columns_metadata)):
        column = columns_metadata[i].get("name")
        data_dict = record[i]
        for data_type, data_value in data_dict.items():
            if data_type == "isNull" and data_value is True:
                row_dict[column] = None
            else:
                row_dict[column] = data_value

    object_list.append(row_dict)

This will give you a list of all records following this format:

object_list = [
    {
        "col1": "obj1val1",
        "col2": "obj1val2",
        "col3": "obj1val3",
    },
    {
        "col1": "obj2val1",
        "col2": "obj2val2",
        "col3": "obj2val3",
    }
]

NB: This could probably be optimized for performance (currently written for readability)

Shraneid
  • 309
  • 1
  • 2
  • 12