SageMaker training job fails with "FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/annotations.json'"

Question

When I try to use a Quick Start model in AWS SageMaker for object detection, every fine-tunable model fails to train.

I'm attempting to fine-tune an SSD MobileNet V1 FPN 640x640 COCO '17 model.

The annotations and images are accepted, but after the training session is initialized, the training job cannot find a specific file: FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/annotations.json'.

The S3 directory I provided follows the required template, shown here with a single image for simplicity:

images/
  abc.png
annotations/
  abc.json

The following stack trace is returned:

We encountered an error while training the model on your data. AlgorithmError: ExecuteUserScriptError:
ExitCode 1
ErrorMessage "FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/input/data/training/annotations.json'
"
Command "/usr/local/bin/python3.9 transfer_learning.py --batch_size 3 --beta_1 0.9 --beta_2 0.999 --early_stopping False --early_stopping_min_delta 0 --early_stopping_patience 5 --epochs 5 --epsilon 1e-7 --initial_accumulator_value 0.1 --learning_rate 0.001 --model-artifact-bucket jumpstart-cache-prod-us-east-1 --model-artifact-key tensorflow-training/train-tensorflow-od1-ssd-mobilenet-v1-fpn-640x640-coco17-tpu-8.tar.gz --momentum 0.9 --optimizer adam --reinitialize_top_layer Auto --rho 0.95 --train_only_top_layer False", exit code: 1

Could there be an internal bug where the input annotations are not transformed and placed into this directory in the training job container?

score 0 · Answer 1 · answered Jan 26 '23 at 14:48

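The object detection algorithm in the Quick Start solutions expects the input data to contain a single annotations.json file with the annotations for all images, not one JSON file per image inside an annotations/ folder. That is why the job cannot find /opt/ml/input/data/training/annotations.json.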

The file should contain a dictionary with the keys "images" and "annotations". The value of the "images" key should be a list with one entry per image, of the form {"file_name": image_name, "height": height, "width": width, "id": image_id}. The value of the "annotations" key should be a list with one entry per bounding box, of the form {"image_id": image_id, "bbox": [xmin, ymin, xmax, ymax], "category_id": bbox_label}.
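As a rough illustration of that format, here is a minimal Python sketch that assembles the dictionary and writes it to annotations.json. The function names and the tuple-based input shape are hypothetical, chosen only for this example. How you read your existing per-image annotation files depends on their format, which is not shown in the question.

```python
import json
from pathlib import Path


def build_annotations(images, boxes):
    """Assemble the single-file annotation dictionary.

    images: list of (file_name, height, width) tuples (hypothetical input shape).
    boxes:  list of (file_name, [xmin, ymin, xmax, ymax], category_id) tuples.
    """
    image_ids = {}
    image_entries = []
    # Assign each image a sequential integer id and remember it by file name.
    for image_id, (file_name, height, width) in enumerate(images):
        image_ids[file_name] = image_id
        image_entries.append(
            {"file_name": file_name, "height": height, "width": width, "id": image_id}
        )

    # One entry per bounding box, linked to its image through image_id.
    annotation_entries = []
    for file_name, bbox, category_id in boxes:
        annotation_entries.append(
            {
                "image_id": image_ids[file_name],
                "bbox": list(bbox),
                "category_id": category_id,
            }
        )
    return {"images": image_entries, "annotations": annotation_entries}


def write_annotations(dataset_dir, images, boxes):
    """Write annotations.json at the root of the dataset directory."""
    out_path = Path(dataset_dir) / "annotations.json"
    out_path.write_text(json.dumps(build_annotations(images, boxes), indent=2))
    return out_path
```

For the one-image example in the question, a call might look like build_annotations([("abc.png", 480, 640)], [("abc.png", [10, 20, 110, 220], 1)]), where the image size, box coordinates, and label are made-up values.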

The expected directory structure is:

images/
  abc.png
  def.png
annotations.json
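Before uploading to S3, it may help to check a local copy of the dataset against this layout. The following is a small sketch, not part of SageMaker itself. The function name check_dataset_layout is hypothetical, and it only verifies that annotations.json exists at the dataset root, has both keys, and refers to image files that are actually present under images/.

```python
import json
from pathlib import Path


def check_dataset_layout(dataset_dir):
    """Return a list of problems found in a local copy of the dataset."""
    root = Path(dataset_dir)
    problems = []

    # The training job looks for annotations.json at the root of the channel.
    annotations_path = root / "annotations.json"
    if not annotations_path.is_file():
        problems.append("annotations.json is missing from the dataset root")
        return problems

    data = json.loads(annotations_path.read_text())
    for key in ("images", "annotations"):
        if key not in data:
            problems.append(f"annotations.json has no '{key}' key")

    # Every listed image should exist under images/.
    known_ids = set()
    for entry in data.get("images", []):
        known_ids.add(entry["id"])
        if not (root / "images" / entry["file_name"]).is_file():
            problems.append(f"images/{entry['file_name']} is missing")

    # Every bounding box should point at a listed image.
    for entry in data.get("annotations", []):
        if entry["image_id"] not in known_ids:
            problems.append(
                f"annotation refers to unknown image_id {entry['image_id']}"
            )
    return problems
```

An empty list means the local copy matches the layout as far as this check goes. With the layout from the question (an annotations/ folder with per-image files), it would report that annotations.json is missing.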
