0

I would like to train my model locally using this command:

gcloud ml-engine local train \
  --module-name cloud_runner \
  --job-dir ./tmp/output

The issue is that it complains that --job-dir: Must be of form gs://bucket/object.

This is a local training run, so I'm wondering why it wants the output to be a GCS bucket rather than a local directory.

PseudoAj
hellowill89

3 Answers

3

As the other answers explain, gcloud's --job-dir flag expects a GCS location. To work around that, pass the directory straight to your module as a user argument, after the bare -- separator:

gcloud ml-engine local train \
    --package-path trainer \
    --module-name trainer.task \
    -- \
    --train-files $TRAIN_FILE \
    --eval-files $EVAL_FILE \
    --job-dir $JOB_DIR \
    --train-steps $TRAIN_STEPS
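
Everything after the bare -- is forwarded to the module untouched, so the module has to parse those flags itself. A minimal sketch of what that parsing could look like inside trainer/task.py, assuming argparse and the flag names from the command above (the parse_args helper and its default values are hypothetical, not taken from any sample):

```python
import argparse


def parse_args(argv=None):
    """Parse the user arguments that gcloud forwards after the bare `--`."""
    parser = argparse.ArgumentParser()
    # Flag names mirror the command above; the defaults are illustrative only.
    parser.add_argument("--train-files", default=None)
    parser.add_argument("--eval-files", default=None)
    # Because the module parses this flag itself, a local path is accepted.
    parser.add_argument("--job-dir", default="./tmp/output")
    parser.add_argument("--train-steps", type=int, default=1000)
    # parse_known_args ignores any extra flags the module does not define.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

Calling parse_args() with no arguments reads sys.argv, so the same function works when the module is launched through gcloud or run by hand.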
Mark
1

The --package-path argument to the gcloud command should point to a directory that is a valid Python package, i.e., a directory that contains an __init__.py file (often an empty file). Note that it should be a local directory, not one on GCS.

The --module argument will be the fully qualified name of a valid Python module within that package. You can organize your directories however you want, but for the sake of consistency, the samples all have a Python package named trainer with the module to be run named task.py.

-- Source

So you need to change this block to use a valid path:

gcloud ml-engine local train \
  --module-name cloud_runner \
  --job-dir ./tmp/output

Specifically, your error comes from --job-dir ./tmp/output, because gcloud expects that flag to be a GCS path of the form gs://bucket/object.

Community
PseudoAj
1

Local training tries to emulate what happens when you run using the Cloud because the point of local training is to detect problems before submitting your job to the service.

Using a local job-dir when using the CMLE service is an error because the output wouldn't persist after the job finishes.

So local training with gcloud also requires that job-dir be a GCS location.

If you want to run locally and not use GCS you can just run your TensorFlow program directly and not use gcloud.
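
When the program is run directly, nothing checks the form of job-dir, so the program itself decides what to do with a local path. A small sketch of one way to handle both cases, assuming the program receives the directory as a string (is_gcs_path and prepare_job_dir are hypothetical helper names, not part of gcloud or TensorFlow):

```python
import os


def is_gcs_path(path):
    """Return True for paths of the form gs://bucket/object."""
    return path.startswith("gs://")


def prepare_job_dir(job_dir):
    """Create a local output directory if needed; leave GCS paths untouched."""
    if not is_gcs_path(job_dir):
        # Local runs need the directory to exist before anything is written.
        os.makedirs(job_dir, exist_ok=True)
    return job_dir
```

With a helper like this, the same program can write to ./tmp/output when run by hand and to a gs:// location when submitted to the service.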

Jeremy Lewi