AWS Transcribe provides two ways to create a custom vocabulary (for more information, see Custom Vocabularies in the AWS documentation):

  • Using List
  • Using Table

I can create custom vocabularies both ways via the AWS console, but with the AWS Java SDK I can only create one using a list. When I try the table format, the vocabulary fails with the following error:

Failure reason

The vocabulary that you’re trying to create contains invalid characters or incorrectly formatted terms. See the developer guide for more information.

    AmazonTranscribe transcribe = AmazonTranscribeClient.builder().build();
    CreateVocabularyRequest vocabularyRequest = new CreateVocabularyRequest();
    vocabularyRequest.setLanguageCode(LanguageCode.EnUS.toString());
    vocabularyRequest.setPhrases(Arrays.asList("Phrase\tIPA\tSoundsLike\tDisplayAs", "helloooo\t\thello\thailo"));
    vocabularyRequest.setVocabularyName("table-clone");
    CreateVocabularyResult vocabularyResult = transcribe.createVocabulary(vocabularyRequest);

However, I can create the same vocabulary as a table via the AWS console, so I don't think the vocabulary itself is the problem.

Case 1: Via AWS Console

One more important thing to notice: when a vocabulary is created using the list view, AWS appends an end delimiter (ENDOFDICTIONARYTRANSCRIBE). It does not append this delimiter when the vocabulary is created using the table view.

Case 2: Via AWS Java SDK

The end delimiter is appended to the end of the file in both cases (list and table). I suspect this may be the cause.

To Sum Up

I want to create a custom vocabulary in table format via the AWS Java SDK. I can create it via the AWS Console, but it fails via the Java SDK.

Nishant Thapliyal
  • In the console, you have to put the file in S3 first before using the table format; it wouldn't surprise me if you need to do the same here. – okaram Feb 18 '21 at 20:10

1 Answer

You can create a custom vocabulary in table format by uploading the .txt file to AWS S3 and then passing the URI of that object as the value of VocabularyFileUri (in the Java SDK, setVocabularyFileUri on CreateVocabularyRequest) instead of setting phrases.

In the AWS Console you can also create a vocabulary by uploading a file in list format, but if you need to use tables, S3 is the way to go!
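
As a rough illustration of the file you would upload, here is a minimal sketch that builds the tab-separated table text using only the JDK. The class and method names (VocabularyTableFile, row, build) are hypothetical helpers, not part of the AWS SDK, and the column order is simply copied from the question's example. Uploading the result to S3 and passing its URI to setVocabularyFileUri are not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the AWS SDK): builds the text of a
// table-format vocabulary file, one tab-separated row per line.
class VocabularyTableFile {
    // Header copied from the question's example; assumed to match the table format.
    static final String HEADER = "Phrase\tIPA\tSoundsLike\tDisplayAs";

    // Joins the four columns with tabs; a null field becomes an empty column.
    static String row(String phrase, String ipa, String soundsLike, String displayAs) {
        List<String> cells = new ArrayList<>();
        for (String field : new String[] {phrase, ipa, soundsLike, displayAs}) {
            cells.add(field == null ? "" : field);
        }
        return String.join("\t", cells);
    }

    // Puts the header first, then each row, each terminated by a newline.
    static String build(List<String> rows) {
        StringBuilder sb = new StringBuilder(HEADER).append('\n');
        for (String r : rows) {
            sb.append(r).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same row as in the question: no IPA, only SoundsLike and DisplayAs.
        String content = build(Arrays.asList(row("helloooo", null, "hello", "hailo")));
        System.out.print(content);
        // Next steps (not shown): save this text as a .txt file, upload it to S3,
        // and set the object's URI on the request instead of calling setPhrases.
    }
}
```

Keeping the file-building step separate from the request makes it easy to inspect the exact text (tabs, empty columns, line endings) before it goes to S3, which is where formatting errors like the one in the question tend to hide.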