
I want to access a tfevent file that was created during training and stored in the logs of an Azure ML service run. The file is not corrupted: plain TensorBoard reads and displays it correctly. But when I use Azure ML's Tensorboard library to access it, either nothing shows up in the local TensorBoard or the connection is refused.

I first wrote it to ./logs/tensorboard, mirroring Azure ML's own ./logs/azureml, but the TensorBoard launched by Azure ML's module reports in the browser that there is nothing to show:

No dashboards are active for the current data set.
Probable causes:

You haven’t written any data to your event files.
TensorBoard can’t find your event files.
If you’re new to using TensorBoard, and want to find out how to add data and set up your event files, check out the README and perhaps the TensorBoard tutorial.
If you think TensorBoard is configured properly, please see the section of the README devoted to missing data problems and consider filing an issue on GitHub.

Last reload: Wed Aug 21 2019 *****
Data location: /tmp/tmpkfj7gswu

So I assumed that AML did not recognize that save location and changed it to ./logs. Now the browser shows "This site can’t be reached. ****** refused to connect."

My Azure ML Python SDK version is 1.0.57

1) How can I fix this?

2) Where should I save the tfevent file so that AML recognizes it? I couldn't find any information about this in the documentation: https://learn.microsoft.com/en-us/python/api/azureml-tensorboard/azureml.tensorboard.tensorboard?view=azure-ml-py

This is how I'm launching tensorboard through Azure ML.

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=f'This script is to launch TensorBoard with '
        f'accessing run history from machine learning '
        f'experiments that output Tensorboard logs')
    parser.add_argument('--experiment-name',
                        dest='experiment_name',
                        type=str,
                        help='experiment name in Azure ML')
    parser.add_argument('--run-id',
                        dest='run_id',
                        type=str,
                        help='run ID in Azure ML')

    args = parser.parse_args()

    logger = get_logger(__name__)
    logger.info(f'SDK Version: {VERSION}')

    workspace = get_workspace()
    experiment_name = args.experiment_name
    run_id = args.run_id
    experiment = get_experiment(experiment_name, workspace, logger)
    run = get_run(experiment, run_id)

    # The Tensorboard constructor takes an array of runs, so pass it in as a single-element array here
    tb = Tensorboard([run])

    # If successful, start() returns a string with the URI of the instance.
    url = tb.start()
    print(url)
kofuji

1 Answer


Tensorboard support in AzureML is designed to work as follows:

  1. You train your model on an AMLCluster or an attached VM and write the Tensorboard log file to the ./logs directory (see here for an example of a script to run).
from azureml.train.dnn import TensorFlow

script_params = {"--log_dir": "./logs"}

# If you want the run to go longer, set --max-steps to a higher number.
# script_params["--max_steps"] = "5000"

tf_estimator = TensorFlow(source_directory=exp_dir,
                          compute_target=attached_dsvm_compute,
                          entry_script='mnist_with_summaries.py',
                          script_params=script_params)

run = exp.submit(tf_estimator)
  2. On your local computer or on your Notebook VM, you start an azureml.tensorboard.Tensorboard instance, which then continuously pulls down the logs from the run and writes them to local disk. It also starts a TensorBoard instance that you can point your browser to.
# The Tensorboard constructor expects a list of runs, even for a single run.
tb = Tensorboard([run])
# If successful, start() returns a string with the URI of the instance.
tb.start()

If you do this on your local machine, the URL will be http://localhost:6006 (or your machine's hostname in place of localhost), since 6006 is the default port. On the Notebook VM, the URL will be of the form https://vmname-6006.westeurope.notebooks.azureml.net/
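If the hostname-based URL returned by start() does not resolve in your browser, one thing to try on a local machine is the same port on localhost. The helper below is only a sketch of that workaround: with_localhost is a hypothetical name, not part of the azureml SDK, and it assumes that TensorBoard runs on the same machine as the browser.

```python
from urllib.parse import urlsplit, urlunsplit


def with_localhost(url: str) -> str:
    """Return the same URL with its host replaced by 'localhost'.

    Hypothetical helper (not an azureml API). It only makes sense when
    TensorBoard was started on the machine where the browser runs.
    """
    parts = urlsplit(url)
    # Keep the port if the URL has one, e.g. ":6006".
    netloc = "localhost" if parts.port is None else f"localhost:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

You could then call with_localhost(tb.start()) and open the result in the browser instead of the URL that was printed.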

In the diagram of how a run in AzureML is executed, steps #6 and #7 are the relevant ones here: they show how the Tensorboard logs travel from the compute target to the machine that runs the actual TensorBoard. That is "My Computer" in this case, but it could also be a Notebook VM.
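To check whether the logs actually arrived on the local machine, you can look for event files under the directory that TensorBoard reports as its "Data location" (the /tmp/... path in the question). The following is a minimal sketch using only the standard library. find_event_files is a hypothetical helper name, and it assumes that event files can be recognized by "tfevents" in the file name, as in the usual events.out.tfevents.* naming.

```python
import os
from typing import List


def find_event_files(log_root: str) -> List[str]:
    """Return sorted paths of files under log_root that look like event files.

    Assumption: an event file has 'tfevents' somewhere in its file name.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(log_root):
        for name in filenames:
            if "tfevents" in name:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

An empty result for the data location suggests that no logs were pulled down from the run, which would be consistent with the "No dashboards are active" message. In that case, check that the training script really writes into ./logs on the compute target.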

Daniel Schneider
  • Is it possible to choose the URL at which Tensorboard is launched? Azure ML Service gives me http://MacBook-Pro-de-Valentin.local:6006/ and it does not work. – Valentin Richer Nov 05 '19 at 21:46