
I am trying to run a job on Google Cloud ML Engine and can't seem to pass multiple file paths as arguments to the parser. Here is what I am writing in the terminal:

JOB_NAME=my_job_name
BUCKET_NAME=my_bucket_name
OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME
DATA_PATH=gs://$BUCKET_NAME/my_data_directory
REGION=us-east1

gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.2 \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    -- \
    --file-path "${DATA_PATH}/*" \
    --num-epochs 10 

Here my_data_directory contains multiple files I later want to read. The problem is that --file-path receives only ['gs://my_bucket_name/my_data_directory'] and not a list of the files in that directory.

How do I fix this?

Many thanks in advance.

Miguel Monteiro

1 Answer


Everything after the `-- \` line is passed through to your program as user arguments, so how they are handled depends entirely on the trainer you defined. I would go back and modify the trainer so that it either treats the directory argument specially (expanding it into a file list itself) or accepts multiple paths, like this:

gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.2 \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    --scale-tier STANDARD_1 \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --verbosity DEBUG  \
    --eval-steps 100

Some links that will be helpful for developing your own trainer: [1] [2]
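On the trainer side, accepting multiple paths in one user argument usually comes down to how the argument is declared. A minimal sketch of what the argument parsing in trainer/task.py could look like, assuming argparse is used (the function name and defaults here are illustrative, not from the sample):

```python
import argparse


def parse_args(argv=None):
    """Parse the user arguments that gcloud forwards after the `--` separator."""
    parser = argparse.ArgumentParser()
    # nargs='+' collects one or more space-separated values into a list, so
    # `--train-files gs://b/a.csv gs://b/b.csv` yields both paths.
    parser.add_argument('--train-files', nargs='+', required=True)
    parser.add_argument('--eval-files', nargs='+', required=True)
    parser.add_argument('--train-steps', type=int, default=1000)
    parser.add_argument('--eval-steps', type=int, default=100)
    parser.add_argument('--verbosity', default='INFO')
    return parser.parse_args(argv)
```

With this declaration, $TRAIN_DATA in the gcloud command can expand to several space-separated paths and they all land in args.train_files as a list. Note that a quoted shell glob like "${DATA_PATH}/*" is never expanded by the shell, and even unquoted it would not match anything because gs:// objects are not local files, which is why the pattern arrives at the program as a single literal string.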

Yanfeng Liu
  • Yes, I ended up passing the directory and then listing the files from within my program. It's just that the sample example led me to believe I could pass multiple file names in one user argument, even though the sample itself only uses one file name. – Miguel Monteiro Jul 14 '17 at 19:53