I'm trying to transcribe an MP3 file stored in an S3 bucket I own that has Block Public Access enabled. Uploading the MP3 to my source S3 bucket triggers a Lambda function that should start the Transcribe job. I have two issues:
- I don't know whether the S3 object URL I use for MediaFileUri is correct. I've seen conflicting information.
- I don't know whether my bucket being private is a problem.
CloudWatch shows two error messages:
"An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: 1 validation error detected: Value 'source/2004-DNC.mp3' at 'transcriptionJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[0-9a-zA-Z._-]+"
"An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The S3 URI that you provided can't be accessed. Make sure that you have read permission and try your request again."
Lambda Function
import boto3

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

def lambda_handler(event, context):
    for record in event['Records']:
        source_bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        object_url = "s3://{0}/{1}".format(source_bucket, key)
        response = transcribe.start_transcription_job(
            TranscriptionJobName=key,
            Media={'MediaFileUri': object_url},
            MediaFormat='mp3',
            LanguageCode='en-US',
        )
        print(response)
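The first error message points at the job name: the object key 'source/2004-DNC.mp3' is passed as TranscriptionJobName, and '/' is outside the allowed pattern. Below is a minimal sketch of one way to derive a valid name, assuming it is acceptable to replace disallowed characters with '-'. The helper names (decode_key, job_name_from_key, media_uri) are hypothetical and not part of boto3. The sketch also URL-decodes the key, because S3 event notifications deliver object keys URL-encoded (a space arrives as '+', for example), which may matter when building the s3:// URI.

```python
import re
from urllib.parse import unquote_plus

# Characters NOT allowed by the pattern quoted in the error: ^[0-9a-zA-Z._-]+
_INVALID_JOB_NAME_CHARS = re.compile(r"[^0-9a-zA-Z._-]")


def decode_key(raw_key):
    """Decode the URL-encoded object key found in an S3 event record."""
    return unquote_plus(raw_key)


def job_name_from_key(key, max_len=200):
    """Hypothetical helper: replace disallowed characters with '-'.

    max_len=200 reflects the documented length limit for job names;
    treat it as an assumption and adjust if the limit differs.
    """
    return _INVALID_JOB_NAME_CHARS.sub("-", key)[:max_len]


def media_uri(bucket, key):
    """Build the s3://bucket/key form used for MediaFileUri."""
    return "s3://{0}/{1}".format(bucket, key)


if __name__ == "__main__":
    key = decode_key("source/2004-DNC.mp3")
    print(job_name_from_key(key))
    print(media_uri("abcdefghijk-transcribe-source", key))
```

In the handler, this would mean decoding the key first, then passing job_name_from_key(key) as TranscriptionJobName and media_uri(source_bucket, key) as MediaFileUri. Job names also have to be unique within an account, so uploading the same key twice would need something extra, such as a timestamp suffix; the sketch does not handle that.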
IAM Policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::abcdefghijk-transcribe-source/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "transcribe:StartTranscriptionJob"
            ],
            "Resource": "*"
        }
    ]
}