
I'm running a Python script in an AWS Lambda function. It is triggered by SQS messages that tell the script which objects to load from an S3 bucket for further processing.

The permissions seem to be set up correctly, with a bucket policy that allows the Lambda's execution role to perform any action on any object in the bucket, and the Lambda can access everything most of the time. The objects are loaded via pandas and s3fs: `pandas.read_csv(f's3://{s3_bucket}/{object_key}')`.
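For context, here is a minimal sketch of the handler; the exact SQS message fields (`bucket` and `keys`) are placeholders for the real message format:

```python
import json

import pandas as pd  # pandas uses s3fs under the hood for s3:// paths


def process(df):
    """Placeholder for the downstream processing."""
    pass


def lambda_handler(event, context):
    # Each SQS record's body names the objects to load (field names here are illustrative)
    for record in event["Records"]:
        message = json.loads(record["body"])
        s3_bucket = message["bucket"]
        for object_key in message["keys"]:
            # This is the call that intermittently fails with
            # "An error occurred (403) when calling the HeadObject operation: Forbidden"
            df = pd.read_csv(f"s3://{s3_bucket}/{object_key}")
            process(df)
```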

However, when a new object is uploaded to the S3 bucket, the Lambda can't access it at first. botocore raises `An error occurred (403) when calling the HeadObject operation: Forbidden` when the script tries to access the object. Repeated invocations of the Lambda (even 50+) over several minutes (via SQS) give the same error. However, if I invoke the Lambda with a different SQS message (one that loads different objects from S3) and then re-invoke it with the original message, the Lambda can suddenly access the S3 object that previously failed every time. All subsequent attempts to access this object from the Lambda then succeed.

I'm at a loss for what could cause this. This repeatable three-step process (1. fail on the newly uploaded object, 2. run with other objects, 3. succeed on the original object) can all happen on one Lambda container (the invocations are all in one CloudWatch log stream, which seems to correlate with Lambda containers). So it doesn't seem to be a matter of needing a fresh Lambda container/instance.

Thoughts or ideas on how to further debug this?

penguinrob
  • I'm not sure what is causing this issue, but I would recommend a different way to set up the access permissions. Instead of using a Bucket Policy to grant access to the IAM Role, you should grant S3 permissions **directly in the IAM Role that is assigned to the AWS Lambda function**. A bucket policy should _not_ be involved in this process. Bucket Policies are normally used to grant public or wide-ranging access, rather than access to a specific user/role. – John Rotenstein Jul 04 '20 at 03:54
  • @JohnRotenstein thanks, this is helpful to know. I've checked the IAM role (turns out it already had S3 access) and removed the bucket policy. The behavior is the same as before (including the inconsistent forbidden errors), but this simplifies my setup. – penguinrob Jul 04 '20 at 14:55
  • It sounds a bit like [eventual consistency](https://docs.aws.amazon.com/redshift/latest/dg/managing-data-consistency.html) issues, but that should not happen on new objects. Can you show us some minimal code so that we can try to reproduce the issue? – John Rotenstein Jul 04 '20 at 23:38
  • In trying to create a minimal reproducible example, when I use `boto3` directly, I don't get the issue. The main codebase is using `s3fs` to fetch the objects, so that might be the issue – penguinrob Jul 05 '20 at 04:01
  • Oh! You didn't mention that you are using `s3fs`. Amazon S3 is an object storage system, not a filesystem. It is not recommended to use tools like `s3fs` to 'mount' S3 as a filesystem, especially for Production use. It is better to call the AWS API directly. – John Rotenstein Jul 05 '20 at 06:04
  • Ok. We were using it with `pandas` integration, but we didn't need any of the fs-like functionality. Switching to `boto3` directly seems to have resolved the issue. I'll update the question to mention s3fs (when I posted, I just saw the error coming from botocore and didn't think anything of s3fs), and if you want to put your comment in an answer I'm happy to accept it. Thanks for your help with this! – penguinrob Jul 06 '20 at 15:30

1 Answer


Amazon S3 is an object storage system, not a filesystem. It is accessible via API calls that perform actions like GetObject, PutObject and ListBucket.

Utilities like s3fs allow an Amazon S3 bucket to be 'mounted' as a file system. However, behind the scenes s3fs makes normal API calls like any other program would.

This can sometimes (often?) lead to problems, especially where files are being quickly created, updated and deleted. It can take some time for s3fs to update S3 to match what is expected from a local filesystem.

Therefore, it is not recommended to use tools like s3fs to 'mount' S3 as a filesystem, especially for Production use. It is better to call the AWS API directly.
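For example, the CSV could be fetched with a direct `GetObject` call and the bytes handed to pandas (a minimal sketch; the helper name is illustrative, and error handling and retries are omitted):

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")


def read_csv_from_s3(bucket: str, key: str) -> pd.DataFrame:
    """Fetch an object via the S3 API directly and parse it with pandas."""
    response = s3.get_object(Bucket=bucket, Key=key)
    return pd.read_csv(io.BytesIO(response["Body"].read()))
```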

John Rotenstein