I am writing a unit test for a function that reads objects from S3 buckets using the boto3 S3 client method 'select_object_content'. The response I am trying to mock is:

{
    'Payload': EventStream({
        'Records': {
            'Payload': b'bytes'
        },
        'Stats': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Progress': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Cont': {},
        'End': {}
    })
}

The Payload is an EventStream object, which is constructed as EventStream(self, raw_stream, output_shape, parser, operation_name) and so takes 4 arguments besides self. I have the raw_stream as a UTF-8 encoded byte string, but I cannot find any information on how the other three arguments should be set.

I am using MagicMock to mock s3_client.select_object_content.

I want to pass in Athena results (which sit in S3 as a CSV) as the stream, so that the code has unit tests covering certain scenarios.

Edit: I was able to mock the response with the following structure, using a plain list of event dicts in place of the EventStream.

The return type of my mock function is Dict[str, Any]:

return {
    'Payload': [
        {'Records': {'Payload': b"some utf8 encoded byte stream"}},
        {'Records': {'Payload': b"some utf8 encoded byte stream"}},
    ]
}
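A minimal sketch of why a plain list seems to be enough here, assuming the code under test only iterates over 'Payload' and joins the 'Records' chunks. The read_csv_rows function, the bucket and key names, and the CSV data are all hypothetical placeholders, not my real code. The record bytes are deliberately split mid-row, since chunk boundaries in a real stream are not guaranteed to fall on row boundaries.

```python
import csv
import io
from typing import Any, Dict, List
from unittest.mock import MagicMock


def read_csv_rows(s3_client: Any, bucket: str, key: str) -> List[List[str]]:
    """Hypothetical function under test: run S3 Select and parse the CSV rows."""
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        Expression="SELECT * FROM s3object s",
        ExpressionType="SQL",
        InputSerialization={"CSV": {"FileHeaderInfo": "NONE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    # Only iteration is needed, so any iterable of event dicts will do.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    text = b"".join(chunks).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


def fake_select_response() -> Dict[str, Any]:
    """Made-up response: a list of event dicts standing in for the EventStream."""
    return {
        "Payload": [
            {"Records": {"Payload": b"id,name\n1,al"}},   # row split across chunks
            {"Records": {"Payload": b"ice\n2,bob\n"}},
            {"Stats": {"Details": {"BytesScanned": 22,
                                   "BytesProcessed": 22,
                                   "BytesReturned": 22}}},
            {"End": {}},
        ]
    }


s3_client = MagicMock()
s3_client.select_object_content.return_value = fake_select_response()
rows = read_csv_rows(s3_client, "test-bucket", "results.csv")
```

Different scenarios (empty result, a missing 'End' event, malformed rows) can then be covered by swapping in a different list of events.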
qazplm

1 Answer

There's an open bug for this against botocore. The best option at this point is to mock out the entire method rather than rely on Stubber:

import json
from unittest.mock import ANY, patch

with patch.object(self.s3_client, 'select_object_content') as mock_select:
    # A plain list of event dicts stands in for the EventStream.
    mock_select.return_value = {
        "ResponseMetadata": ANY,
        "Payload": [{
            "Records": {
                "Payload": json.dumps(MANIFEST_DATA).encode()
            },
            "Stats": {},
            "End": {}
        }]
    }

    # ... call the code under test here, before asserting on the mock ...

    mock_select.assert_called_once_with(
        Bucket="test-bucket",
        Key=manifest_key,
        Expression="Select * from s3Object o",
        ExpressionType="SQL",
        InputSerialization={"JSON": {"Type": "LINES"}},
        OutputSerialization={"JSON": {}}
    )
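For reference, here is a self-contained sketch of the same pattern end to end. Several pieces are assumptions for illustration only: S3ClientStub is a hypothetical stand-in so the example runs without boto3 or network access (in a real test you would patch the actual client), and load_manifest, the MANIFEST_DATA contents, and the key name are made up. Each event is put in its own dict here, which I believe is closer to what a real stream yields, but treat that as an assumption too.

```python
import json
from unittest.mock import patch

MANIFEST_DATA = {"files": ["a.csv", "b.csv"]}  # hypothetical test data


class S3ClientStub:
    """Hypothetical stand-in for a boto3 S3 client; never hits the network."""

    def select_object_content(self, **kwargs):
        raise RuntimeError("should be patched in tests")


def load_manifest(s3_client, bucket, key):
    """Hypothetical code under test: fetch a JSON manifest via S3 Select."""
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        Expression="Select * from s3Object o",
        ExpressionType="SQL",
        InputSerialization={"JSON": {"Type": "LINES"}},
        OutputSerialization={"JSON": {}},
    )
    # Join every Records chunk; skip Stats, Progress, Cont and End events.
    body = b"".join(
        event["Records"]["Payload"]
        for event in response["Payload"]
        if "Records" in event
    )
    return json.loads(body)


s3_client = S3ClientStub()
manifest_key = "manifest.json"  # hypothetical key

with patch.object(s3_client, "select_object_content") as mock_select:
    mock_select.return_value = {
        "Payload": [
            {"Records": {"Payload": json.dumps(MANIFEST_DATA).encode()}},
            {"Stats": {}},
            {"End": {}},
        ]
    }

    result = load_manifest(s3_client, "test-bucket", manifest_key)

    # Raises AssertionError if the call arguments differ.
    mock_select.assert_called_once_with(
        Bucket="test-bucket",
        Key=manifest_key,
        Expression="Select * from s3Object o",
        ExpressionType="SQL",
        InputSerialization={"JSON": {"Type": "LINES"}},
        OutputSerialization={"JSON": {}},
    )
```

The important ordering is: set return_value, call the code under test, then assert on the mock. Asserting before the call would fail, because the mock has not been invoked yet.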
akarve