I'm trying to fetch the data from an S3 object. I'm using the S3 Select feature as below:

boto3 version : 1.7.59

import boto3

s3 = boto3.client('s3')
r = s3.select_object_content(
    Bucket="bucket",
    Key="file.json",
    ExpressionType='SQL',
    Expression="select * from s3object S3Object AS s",
    InputSerialization = {
                            'JSON': {
                            'Type': 'LINES'
                            }
                        },
    OutputSerialization = { 'JSON': { 'RecordDelimiter': ',' } },
)


for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
        print("Stats details bytesScanned: ")
        print(statsDetails['BytesScanned'])
        print("Stats details bytesProcessed: ")
        print(statsDetails['BytesProcessed'])

After running my code I get this error:

Traceback (most recent call last):
  File "C:/Users/a_urrego/PycharmProjects/DW_FlightHub/S3Select.py", line 48, in <module>
    OutputSerialization = { 'JSON': { 'RecordDelimiter': ',' } },
  File "C:\Users\a_urrego\AppData\Local\Programs\Python\Python36-32\lib\site-packages\botocore\client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "C:\Users\a_urrego\AppData\Local\Programs\Python\Python36-32\lib\site-packages\botocore\client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ParseUnexpectedToken) when calling the SelectObjectContent operation: Unexpected token found AS:as at line 1, column 33.

Process finished with exit code 1

Andres Urrego Angel

2 Answers

Looks like the SQL expression you're passing is invalid:

"select * from s3object S3Object AS s"

The general SQL syntax is:

"SELECT <columns | *> FROM <table> <alias>"

but it looks like you've duplicated a table name or something there. Upper casing on the SQL statements is optional, but I tend to like it.

I haven't used this feature of boto3, but this seems to be the issue after 3 minutes of googling and reading the error message.

[Edit]

Updated my template above after realizing a typo. Also worth noting that a table alias is unnecessary in this use case as it's a very simple SELECT statement.
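A minimal sketch of the corrected request, assuming the only problem is the duplicated table name: the expression uses the table name `S3Object` once, followed by a single alias. The helper names `build_select_request`, `collect_records`, and `run_query` are hypothetical and only serve the illustration; the serialization settings are copied from the question.

```python
def build_select_request(bucket, key, alias="s"):
    """Build keyword arguments for select_object_content (hypothetical helper)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # One table name (S3Object) followed by one alias, nothing else.
        "Expression": f"SELECT * FROM S3Object {alias}",
        "InputSerialization": {"JSON": {"Type": "LINES"}},
        "OutputSerialization": {"JSON": {"RecordDelimiter": ","}},
    }


def collect_records(payload):
    """Concatenate the Records chunks of an event stream and decode them once."""
    chunks = []
    for event in payload:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Decode after joining so a multi-byte character split across chunks survives.
    return b"".join(chunks).decode("utf-8")


def run_query(s3_client, bucket, key):
    """Run the query with an existing S3 client and return the records as text."""
    response = s3_client.select_object_content(**build_select_request(bucket, key))
    return collect_records(response["Payload"])
```

With a real client this would be called as `run_query(boto3.client('s3'), "bucket", "file.json")`, using the placeholder bucket and key from the question.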

BowlingHawk95

You don't need the AS either; it will raise another error. The following is enough:

select * from s3object S3Object
Aramis NSR