
I'm trying to launch a custom training job on Vertex AI through XManager. When I run custom jobs with TensorBoard enabled, a TensorBoard instance appears under Experiments -> TensorBoard instances, and the custom job page shows an OPEN TENSORBOARD button. However, the button leads to an empty page that says "Not found: TensorboardExperiment".

  • I observed this behaviour when running my own custom job and when running XManager's example cifar10_tensorflow. Note that in both cases the job runs to completion without problems.
  • I can visualise the logs locally with the standard tensorboard package by passing as log_dir the Cloud Storage directory that contains the experiment logs.
  • I can upload experiment logs to Vertex AI TensorBoard manually using:

      tb-gcp-uploader --tensorboard_resource_name \
        TENSORBOARD_INSTANCE_NAME \
        --logdir=LOG_DIR \
        --experiment_name=TB_EXPERIMENT_NAME --one_shot=True
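
For anyone reproducing the manual workaround, the upload above can be scripted. This is only a minimal sketch: it wraps the same tb-gcp-uploader flags shown in the command and adds nothing new. The helper names build_upload_command and upload_logs are hypothetical, and the resource name, bucket path, and experiment name in the usage note are placeholders, not values from my project.

```python
import shlex
import subprocess


def build_upload_command(tensorboard_resource_name, logdir, experiment_name, one_shot=True):
    """Build the tb-gcp-uploader argument list using the flags from the command above.

    Hypothetical helper: it only assembles the arguments and does not contact GCP.
    """
    return [
        "tb-gcp-uploader",
        "--tensorboard_resource_name", tensorboard_resource_name,
        f"--logdir={logdir}",
        f"--experiment_name={experiment_name}",
        f"--one_shot={one_shot}",
    ]


def upload_logs(tensorboard_resource_name, logdir, experiment_name):
    """Print and run the upload; assumes tb-gcp-uploader is installed and authenticated."""
    cmd = build_upload_command(tensorboard_resource_name, logdir, experiment_name)
    print(shlex.join(cmd))  # show the exact command before running it
    subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
```

It would be called as, for example, upload_logs("projects/PROJECT/locations/LOCATION/tensorboards/TENSORBOARD_ID", "gs://BUCKET/LOG_DIR", "TB_EXPERIMENT_NAME"), with every placeholder replaced by real values.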
