
I'm following along with Google's object detection on a TPU post and have hit a wall when it comes to training.

Looking at the job logs, I can see that ml-engine runs a ton of pip installs for various packages, provisions a TPU, and then submits the following:

Running command: python -m object_detection.model_tpu_main 
--model_dir=gs://{MY_BUCKET}/train --tpu_zone us-central1 
--pipeline_config_path=gs://{MY_BUCKET}/data/pipeline.config 
--job-dir gs://{MY_BUCKET}/train

It then errors with:

message:  "Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/model_tpu_main.py", line 30, in <module>
from object_detection import model_lib
File "/root/.local/lib/python2.7/site-packages/object_detection/model_lib.py", line 26, in <module>
from object_detection import eval_util
File "/root/.local/lib/python2.7/site-packages/object_detection/eval_util.py", line 28, in <module>
from object_detection.metrics import coco_evaluation
File "/root/.local/lib/python2.7/site-packages/object_detection/metrics/coco_evaluation.py", line 20, in <module>
from object_detection.metrics import coco_tools
File "/root/.local/lib/python2.7/site-packages/object_detection/metrics/coco_tools.py", line 47, in <module>
from pycocotools import coco
File "/root/.local/lib/python2.7/site-packages/pycocotools/coco.py", line 49
import matplotlibnmatplotlib.use('Agg')nimport matplotlib.pyplot as plt
                                ^
SyntaxError: invalid syntax
"   

This is my first time using ml-engine and I'm stuck. I find it odd that the error references Python 2.7, since I submitted the job from my laptop in a Python 3.6 environment.

Any ideas on where to go from here or what to do?

liamdalton
Gshock

1 Answer


Based on the stack trace, three separate lines of code ended up on a single line (line 49 of coco.py). I believe I ran into the same problem recently while working with the new TensorFlow object detection API. The cause was in models/research/object_detection/dataset_tools/create_pycocotools_package.sh, specifically the following line:

sed "s/import matplotlib\.pyplot as plt/import matplotlib\nmatplotlib\.use\(\'Agg\'\)\nimport matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated

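The problem for me was that the \n escapes in the replacement weren't interpreted as newlines (some sed implementations, such as the BSD sed shipped with macOS, emit a literal "n" instead, which matches the "matplotlibnmatplotlib" in the traceback). I solved it by using literal newlines, like the following: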

sed "s/import matplotlib\.pyplot as plt/import matplotlib\\
matplotlib\.use\(\'Agg\'\)\\
import matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated
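To check the fix locally before re-bundling, something like the following sketch may help. It applies the same kind of substitution to a throwaway one-line file and prints the result. The temporary directory and the sample file are assumptions made for illustration only; they stand in for the real pycocotools/coco.py. The replacement text is also left unescaped here, since the dot, parentheses, and quotes are not special on the replacement side of a sed substitution.

```shell
#!/bin/sh
# Hypothetical stand-in for pycocotools/coco.py: a one-line sample file
# in a scratch directory (not the real package).
workdir=$(mktemp -d)
printf 'import matplotlib.pyplot as plt\n' > "$workdir/coco.py"

# Inside double quotes the shell turns "\\" into a single backslash, so
# sed receives backslash + newline, which it treats as a newline in the
# replacement text. This does not rely on sed understanding "\n".
sed "s/import matplotlib\.pyplot as plt/import matplotlib\\
matplotlib.use('Agg')\\
import matplotlib.pyplot as plt/g" "$workdir/coco.py" > "$workdir/coco.py.updated"

# Show the rewritten file; the three statements should be on separate lines.
cat "$workdir/coco.py.updated"
```

If the output still shows everything on one line with stray "n" characters, the line breaks inside the sed expression were probably lost when the script was edited or copied.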

Hope this helps.

K.Lee
  • @Gshock @K.Lee - I'm having the same problem but I'm not sure how to edit the file you're referring to since it's running on the Google Cloud Platform instead of locally. How did you edit the `create_pycocotools_package.sh` file? I edited it locally and it doesn't seem to change anything... – jordajm Aug 20 '18 at 21:33
  • Never mind - I realized I had forgotten to re-bundle the code before re-running the job. Thanks for this answer! +1 – jordajm Aug 20 '18 at 22:06