They aren't.
The Boto3 documentation doesn't describe the credential chain very well, but the CLI documentation lists the various sources for credentials (and since the CLI is written in Python and built on the same underlying botocore library as Boto3, its documentation is authoritative here).
Unlike EC2 and ECS, which retrieve role-based credentials from instance metadata, Lambda is provided with credentials in environment variables. The Lambda runtime sets those environment variables when it starts, and every invocation of that Lambda runtime uses the same values.
Concurrent Lambdas receive separate sets of credentials, just like you would if you made concurrent explicit calls to STS AssumeRole.
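If you want to check which credentials a particular runtime was given without writing the secrets themselves to the logs, one option is to log a short hash of each value instead. This is only a sketch: the helper name `credential_fingerprints` and the 12-character truncation are my own choices, not anything Lambda or Boto3 provides. The three variable names are the standard ones that Lambda sets.

```python
import hashlib
import os

# Standard credential variables set by the Lambda runtime.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def credential_fingerprints(env=os.environ):
    """Return a short SHA-256 fingerprint for each credential variable that is set.

    Hypothetical helper: the fingerprint changes whenever the underlying value
    changes, but the value itself never appears in the output.
    """
    fingerprints = {}
    for name in CREDENTIAL_VARS:
        value = env.get(name)
        if value is not None:
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            fingerprints[name] = digest[:12]  # truncated for readability (arbitrary length)
    return fingerprints
```

Calling `print(credential_fingerprints())` from the handler in two different runtimes should show different fingerprints if, as described above, each runtime receives its own set of credentials.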
Provisioned concurrency is a little trickier. You might think that the same Lambda runtime lives "forever," but in fact it does not: if you repeatedly invoke a Lambda with provisioned concurrency, you'll see that at some point it creates a new CloudWatch log stream. This is an indication that Lambda has started a new runtime. Lambda will finish initializing the new runtime before it stops sending requests to the old runtime, so you don't get a cold start delay.
Update:
Here's a Python Lambda that demonstrates what I've said above. As part of its initialization code (outside the handler) it records when it was first initialized, and then it reports that timestamp whenever it's invoked. It also logs the current contents of the "AWS" environment variables, so that you can see if any of them change.
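import json
import os
from datetime import datetime

# Module-level code runs once, when Lambda initializes the runtime.
print("initializing environment")
init_timestamp = datetime.utcnow()

def lambda_handler(event, context):
    # Report when this runtime was initialized, not when it was invoked.
    print(f"environment was initialized at {init_timestamp.isoformat()}")
    print("")
    print("**** env ****")
    # Log every AWS_* variable (including the credentials) so changes are visible.
    keys = list(os.environ.keys())
    keys.sort()
    for k in keys:
        if k.startswith("AWS_"):
            print(f"{k}: {os.environ[k]}")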
Configure it for provisioned concurrency, then use this shell command to invoke it every 45 seconds:
while true ; do date ; aws lambda invoke --function-name InvocationExplorer:2 --invocation-type Event --payload '{"foo": "irrelevant"}' /tmp/$$ ; sleep 45 ; done
Leave it running for an hour or more, and you'll get two log streams. The first stream looks like this (showing start and end with several hundred messages omitted):
2021-10-19T16:19:32.699-04:00 initializing environment
2021-10-19T16:30:57.240-04:00 START RequestId: a27f6802-c7e6-4f70-b890-2e0172d46780 Version: 2
2021-10-19T16:30:57.243-04:00 environment was initialized at 2021-10-19T16:19:32.699455
...
2021-10-19T17:07:24.853-04:00 END RequestId: dd9a356f-7928-4bf9-be56-86f4c5e1bb64
2021-10-19T17:07:24.853-04:00 REPORT RequestId: dd9a356f-7928-4bf9-be56-86f4c5e1bb64 Duration: 1.00 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 39 MB
As you can see, the Lambda was initialized at 16:19:32, which was when I enabled provisioned concurrency. The first request was handled at 16:30:57.
But what I want to call out is the last request in this log stream, at 17:07:24, or approximately 48 minutes after the Lambda was initialized.
The second log stream starts like this:
2021-10-19T17:04:08.739-04:00 initializing environment
2021-10-19T17:08:10.276-04:00 START RequestId: 6b15ba7c-91e2-4f91-bb6c-99b9877f1ebf Version: 2
2021-10-19T17:08:10.279-04:00 environment was initialized at 2021-10-19T17:04:08.739398
So as you can see, the second runtime was initialized at 17:04:08, about three minutes before the final request in the first stream (17:07:24), yet it didn't start handling invocations until 17:08:10, after the first stream had ended.
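To make the timing explicit, here is a small sketch that works out the intervals from the four timestamps in the log excerpts above. The function name `handover_summary` and its dictionary keys are mine, chosen for illustration; the timestamps are copied from the logs.

```python
from datetime import datetime


def handover_summary(old_init, old_last_request, new_init, new_first_request):
    """Compute the intervals around a switch from one runtime to the next."""
    return {
        # How long the old runtime lived before its final request.
        "old_runtime_age": old_last_request - old_init,
        # How long the new runtime was already initialized while the old one still served.
        "overlap": old_last_request - new_init,
        # Time between the old runtime's last request and the new runtime's first.
        "gap_between_requests": new_first_request - old_last_request,
    }


summary = handover_summary(
    old_init=datetime.fromisoformat("2021-10-19T16:19:32.699-04:00"),
    old_last_request=datetime.fromisoformat("2021-10-19T17:07:24.853-04:00"),
    new_init=datetime.fromisoformat("2021-10-19T17:04:08.739-04:00"),
    new_first_request=datetime.fromisoformat("2021-10-19T17:08:10.276-04:00"),
)
for name, delta in summary.items():
    print(f"{name}: {delta}")
```

The old runtime was a little under 48 minutes old at its last request, the new runtime had been ready for a bit over three minutes by then, and the gap between the two requests is about 45 seconds, which matches the sleep interval in the shell loop. The overlap is the window in which Lambda had the new runtime ready while still routing requests to the old one.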
This is, of course, not guaranteed behavior. It's how Lambda works today, and may change in the future. But change is unlikely: the current implementation behaves as documented, and any change runs the risk of breaking customer code.