I would like to create a task in which one worker labels multiple sound sources with AWS SageMaker Ground Truth. I created the manifest file below, but I cannot specify multiple sound sources with source-ref. How should I write the manifest file?

dataset.manifest

 {"source-ref":["s3://sagemaker-sample/audio_01.wav", "s3://sagemaker-sample/audio_02.wav"]}

Error

ClientError: Manifest: s3://sagemaker-sample/dataset.manifest has invalid format at line number 0. Make sure that source or source-ref field contains a string value

shimokin
  • Your format is different from the one in the [docs](https://docs.aws.amazon.com/redshift/latest/dg/loading-data-files-using-manifest.html)? – Marcin May 15 '20 at 00:31
  • Thanks! However, the following message is displayed. [Error Message] **ClientError: Manifest: s3://sagemaker-sample/dataset_multi.manifest has invalid format at line number 0. Make sure that the record contains either 'source' or 'source-ref' but not both.** [Manifest File] `{"sources":[{"source-ref":"s3://sagemaker-sample/signal_01.wav", "name":"signal_01.wav"}, {"source-ref":"s3://sagemaker-sample/signal_02.wav", "name":"signal_02.wav"}]}` – shimokin May 15 '20 at 00:57

1 Answer

Specifying multiple sources as a list under a single "source-ref" is not supported; as the error message says, the value must be a string. For the actual format, please refer to https://docs.aws.amazon.com/sagemaker/latest/dg/sms-input-data-input-manifest.html. Each line of the manifest is a separate JSON object that references a single S3 file, as shown below. Example:

{"source-ref": "S3 bucket location 1"}
{"source-ref": "S3 bucket location 2"}
   ...
{"source-ref": "S3 bucket location n"} 

For your case, it would be

{"source-ref": "s3://sagemaker-sample/audio_01.wav"}
{"source-ref": "s3://sagemaker-sample/audio_02.wav"}
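
If you have many files, you may prefer to generate the manifest rather than write it by hand. Here is a minimal sketch in Python, assuming you already have the S3 URIs in a list; the helper name `build_manifest` is just something made up for this example and is not part of any AWS SDK. It emits one JSON object per line, each with a single string under "source-ref".

```python
import json


def build_manifest(s3_uris):
    """Return manifest text with one {"source-ref": <uri>} JSON object per line.

    build_manifest is a hypothetical helper for illustration only.
    """
    lines = []
    for uri in s3_uris:
        # Each "source-ref" must be a single string, not a list of strings.
        if not isinstance(uri, str):
            raise TypeError(f"source-ref must be a string, got {type(uri).__name__}")
        lines.append(json.dumps({"source-ref": uri}))
    return "\n".join(lines) + "\n"


manifest_text = build_manifest([
    "s3://sagemaker-sample/audio_01.wav",
    "s3://sagemaker-sample/audio_02.wav",
])
print(manifest_text, end="")
```

You would then save the printed text as dataset.manifest and upload it to your bucket as before.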