Error: "argument --job-dir: expected one argument" while training model using AI Platform on GCP

Question

Running macOS Mojave.

I am following the official getting started documentation to run a model using AI Platform.

So far I managed to train my model locally using:

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially
# for distributed training (not applicable here).
gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output

I then proceeded to train the model on AI Platform by going through the following steps:

Set the environment variables: export JOB_NAME="my_first_keras_job" and export JOB_DIR="gs://$BUCKET_NAME/keras-job-dir".
Run the following command to package the trainer/ directory:

Command as indicated in docs:

 gcloud ai-platform jobs submit training $JOB_NAME \
   --package-path trainer/ \
   --module-name trainer.task \
   --region $REGION \
   --python-version 3.5 \
   --runtime-version 1.13 \
   --job-dir $JOB_DIR \
   --stream-logs

I get the error:

ERROR: (gcloud.ai-platform.jobs.submit.training) argument --job-dir: expected one argument Usage: gcloud ai-platform jobs submit training JOB [optional flags] [-- USER_ARGS ...] optional flags may be --async | --config | --help | --job-dir | --labels | ...

As far as I understand, --job-dir does indeed have one argument.

I am not sure what I'm doing wrong. I am running the above command from the trainer/ directory as is shown in the documentation. I tried removing all spaces as described here but the error persists.

score 0 · Answer 1 · answered Dec 30 '19 at 22:31

Are you running this command locally, or on an AI notebook VM in Jupyter? Based on your details I assume you're running it locally. I am not a Mac expert, but hopefully this is helpful.

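I just worked through the same error on an AI notebook VM. My issue was that even though I had assigned it a value in a previous Jupyter cell, the $JOB_NAME variable was passing an empty string to the gcloud command. When a variable is empty and unquoted, the shell drops it entirely, so --job-dir $JOB_DIR becomes a bare --job-dir with nothing after it, which matches the "expected one argument" message. Try running the following in the same shell session to make sure a value is actually being passed for $JOB_DIR when you make the gcloud ai-platform call.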

echo $JOB_DIR
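
If you want something a bit stronger than echo, here is a minimal sketch of the same check. It shows how an empty, unquoted variable disappears from the argument list, then adds a guard you could call before submitting. The helper names count_args and require_var are hypothetical (they are not part of gcloud), and the bucket name is a placeholder.

```shell
#!/bin/sh
# Sketch: why an empty variable produces "expected one argument".
# count_args and require_var are hypothetical helpers, not part of gcloud.

# Print how many arguments the shell actually passes along.
count_args() { echo "$#"; }

EMPTY_DIR=""
# Unquoted and empty: the value vanishes, so only "--job-dir" is passed.
count_args --job-dir $EMPTY_DIR      # prints 1

# Fail early if the variable whose NAME is given in $1 is unset or empty.
require_var() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "ERROR: $1 is unset or empty" >&2
    return 1
  fi
  echo "$1=$value"
}

# Example value only; substitute your own bucket.
JOB_DIR="gs://example-bucket/keras-job-dir"
require_var JOB_DIR                  # prints JOB_DIR=gs://example-bucket/keras-job-dir
count_args --job-dir "$JOB_DIR"      # prints 2
```

Running require_var JOB_NAME && require_var JOB_DIR && require_var REGION in the same terminal right before the gcloud ai-platform jobs submit training command should tell you which variable, if any, did not survive from the export step (for example, because it was exported in a different terminal window or notebook cell).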
