I am trying to launch an async transcription job inside a Lambda. I have a CloudWatch event configured to trigger when the transcription job completes, so that I can perform some action on job completion in a different Lambda. The problem is that the async transcription job appears to launch successfully, with the following jobResult in the log, but the job never completes and the job-completed event is never triggered.

jobResult = java.util.concurrent.CompletableFuture@481a996b[Not completed, 1 dependents]

My code is along the following lines:

public class APIGatewayTranscriptHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        // Assumed: the original snippet does not show where logger is defined.
        LambdaLogger logger = context.getLogger();
        S3Client s3Client = S3Client.create();
        String fileUrl = s3Client.utilities().getUrl(GetUrlRequest.builder().bucket("srcBucket").key("fileName").build()).toString();
        Media media = Media.builder().mediaFileUri(fileUrl).build();

        StartTranscriptionJobRequest request = StartTranscriptionJobRequest.builder()
                .languageCode(LanguageCode.ES_ES)
                .media(media)
                .outputBucketName("destBucket")
                .transcriptionJobName("jobName")
                .mediaFormat("mp3")
                .settings(Settings.builder().showSpeakerLabels(true).maxSpeakerLabels(2).build())
                .build();

        TranscribeAsyncClient transcribeAsyncClient = TranscribeAsyncClient.create();
        CompletableFuture<StartTranscriptionJobResponse> jobResult = transcribeAsyncClient.startTranscriptionJob(request);
        logger.log("jobResult =  " + jobResult.toString());
        
        jobResult.whenComplete((jobResponse, err) -> {
            try {
                if (jobResponse != null) {
                    logger.log("CompletableFuture : response = " + jobResponse.toString());
                } else {
                    logger.log("CompletableFuture : NULL response: error = " + err.getMessage());
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        });

        //Job is completed only if Thread is made to sleep
        /*try {
                Thread.sleep(50000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }*/

        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent();
        response.setStatusCode(200);
        Map<String, String> responseBody = new HashMap<String, String>();
        responseBody.put("Status", jobResult.toString());
        String responseBodyString = new JSONObject(responseBody).toJSONString();
        response.setBody(responseBodyString);
        return response;
    }
}

I have verified that the audio file exists in the source bucket.

The job completes, and the job-completed event is triggered, ONLY if I add some sleep time in the Lambda after launching the job.
For example:

Thread.sleep(50000);

Everything works as expected if the sleep is added, but without Thread.sleep() the job never completes. The Lambda timeout is configured as 60 seconds. Any help or pointers would be much appreciated.

ivish

1 Answer

You are starting a CompletableFuture, but not waiting for it to complete.

Call get() to block until it completes.

        [...]
        logger.log("jobResult =  " + jobResult.toString());
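        jobResult.get(); // blocks until the request is submitted; throws the checked InterruptedException and ExecutionException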

        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent();
        [...]

This also explains why it works when you call sleep(): the sleep gives the Future enough time to complete.

Even though the call only sends an HTTPS request, the Lambda finishes before that request goes out (HTTPS connections are expensive to create).
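To make the difference concrete, here is a minimal sketch that uses only the JDK, not the AWS SDK. The submitJob() method is a hypothetical stand-in for an async client call and simply sleeps for 200 ms before returning a made-up status string. It shows that whenComplete() only registers a callback and returns at once, while join() (or get()) actually blocks until the work is done.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

public class FutureWaitDemo {

    // Hypothetical stand-in for an async SDK call: the "request" needs ~200 ms to go out.
    static CompletableFuture<String> submitJob() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "IN_PROGRESS"; // made-up status value for illustration
        });
    }

    public static void main(String[] args) {
        AtomicBoolean callbackRan = new AtomicBoolean(false);

        CompletableFuture<String> jobResult = submitJob();

        // Registers the callback and returns immediately; nothing is awaited here.
        CompletableFuture<String> logged =
                jobResult.whenComplete((response, err) -> callbackRan.set(true));

        // Typically still false at this point, because the work is in flight.
        System.out.println("done right after whenComplete? " + jobResult.isDone());

        // join() blocks until the future (and the registered callback) has finished.
        String status = logged.join();
        System.out.println("status = " + status + ", callback ran = " + callbackRan.get());
    }
}
```

If main() returned right after whenComplete(), the background task would most likely never finish, since supplyAsync() runs on daemon threads by default. The handler in the question is in a similar position: work still in flight when the handler returns is not guaranteed to complete.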

Augusto
  • Actually I am waiting for the job to complete in the code with jobResult.whenComplete((jobResponse, err)). I have updated my question with additional code. But I am launching an asynchronous job, and the Lambda should not be required to wait for the job to complete. Besides, the job may take a substantial amount of time depending on the size of the input audio, and the Lambda may time out before the job completes. AWS Lambdas have a maximum timeout of 15 minutes. – ivish Oct 12 '20 at 03:55
  • It's not to wait for the job to complete. It's just to wait for the job to be **submitted**. `whenComplete` also doesn't pause until the job is submitted; it only enqueues an action to run when the future completes. Your code enqueues the action to send a message to the AWS service, but the Lambda then reaches the end of its execution without having had time to actually send the request to the server, because you are using the async client. Try calling `get()` as I suggested. The Lambda will take a fraction of a second longer to execute and everything will be fine. – Augusto Oct 12 '20 at 13:43
  • I will try calling get() and get back with the results. – ivish Oct 12 '20 at 14:21
  • Actually your suggestion seems to work. Thanks a lot. None of the AWS docs or examples mention calling get() after launching the job. I will do a few more rounds of testing just to confirm. Thanks again. – ivish Oct 12 '20 at 14:35
  • It's not an AWS thing, but a Java feature: it's how `CompletableFuture` works, combined with the fact that the JVM terminates before the HTTPS request has time to be sent. It might be useful to understand a bit more about how this works in Java. There's an excellent presentation from Angelika Langer that I cannot recommend enough: https://www.youtube.com/watch?v=Q_0_1mKTlnY – Augusto Oct 12 '20 at 19:25
  • Very useful, indeed. – ivish Oct 13 '20 at 02:48
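
On the timeout concern raised in the comments: since get() only waits for the submission to be acknowledged, the wait can be bounded so the handler never blocks for long. The sketch below uses only the JDK. The awaitSubmission() helper and its result strings are hypothetical names introduced for illustration, and the futures in main() are hand-built rather than produced by the AWS SDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {

    // Hypothetical helper: waits at most timeoutMillis for the future to complete
    // and maps each outcome to a short status string.
    static <T> String awaitSubmission(CompletableFuture<T> future, long timeoutMillis) {
        try {
            T response = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "SUBMITTED: " + response;
        } catch (TimeoutException e) {
            return "TIMED_OUT";
        } catch (ExecutionException e) {
            // get() wraps the original failure; the cause carries the real error.
            return "FAILED: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return "INTERRUPTED";
        }
    }

    public static void main(String[] args) {
        // Already-completed future: returns immediately.
        System.out.println(awaitSubmission(CompletableFuture.completedFuture("jobName"), 1000));

        // Future that failed: the cause's message is reported.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("bad request"));
        System.out.println(awaitSubmission(failed, 1000));

        // Future that never completes: gives up after 50 ms.
        System.out.println(awaitSubmission(new CompletableFuture<String>(), 50));
    }
}
```

In a handler like the one in the question, the timeout would presumably be chosen well below the configured Lambda timeout, so the function can still return a meaningful response if the submission is slow.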