
I'm quite new to RL and am currently teaching myself how to implement different algorithms and experiment with hyper-parameters using the tf_agents library.

I've been playing around with the code provided in this tutorial: https://colab.research.google.com/github/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb.

After learning how to use TensorBoard, I've come to wonder how I can visualize a model graph from the tf_agents library. Every TensorBoard tutorial/post seems to implement its own model or define a tf.function to log the graph. However, I just can't apply those methods to the tutorial above.

If someone can help me visualize a model graph from tf_agents in TensorBoard, it would be very much appreciated. Thanks!

1 Answer

Consider that this Colab notebook is a very simplified version of how TF-Agents actually works. In reality you should use a Driver to sample trajectories instead of manually calling

action_step = agent.collect_policy.action(time_step)
time_step = env.step(action_step.action)

at every iteration. The other advantage of the Driver is that it integrates easily with all the metrics in TF-Agents.
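For example, the manual collection loop above can be replaced by a step driver. Here is a minimal sketch, assuming tf_env, agent and replay_buffer exist as in the DQN tutorial (the name collect_driver is just illustrative):

from tf_agents.drivers import dynamic_step_driver

# Each call to run() performs num_steps environment interactions and pushes
# the resulting trajectories to every observer (here, the replay buffer).
collect_driver = dynamic_step_driver.DynamicStepDriver(
    tf_env,
    agent.collect_policy,
    observers=[replay_buffer.add_batch],
    num_steps=1)

time_step, policy_state = collect_driver.run()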

As to your question, here is how:

At the beginning of your training, define a summary writer with something like:

import os
import tensorflow as tf

# Summaries written while this writer is the default end up in train_dir.
train_dir = os.path.join(root_dir, 'train')
train_summary_writer = tf.summary.create_file_writer(
    train_dir, flush_millis=10000)
train_summary_writer.set_as_default()

Now, every time you call agent.train, its summaries will be written through this default writer and flushed to the TensorBoard folder train_dir.
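For illustration, here is a minimal sketch of one training step under that default writer, assuming agent, replay_buffer and a step counter epoch_counter (a tf.Variable) already exist; the extra tf.summary.scalar call is just an example, not something the agent requires:

# Sample experience from the buffer; gather_all() is one simple option for a
# TFUniformReplayBuffer (as_dataset() is the more general approach).
experience = replay_buffer.gather_all()

# agent.train emits its own summaries (e.g. losses) through the default writer.
train_loss = agent.train(experience)

# You can also log extra scalars yourself against the same step counter.
tf.summary.scalar('loss', train_loss.loss, step=epoch_counter)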

To add some metrics into the mix, simply define them with something like:

from tf_agents.metrics import tf_metrics

train_metrics = [
    tf_metrics.NumberOfEpisodes(),
    tf_metrics.EnvironmentSteps(),
    tf_metrics.AverageReturnMetric(buffer_size=collect_episodes_per_epoch),
    tf_metrics.AverageEpisodeLengthMetric(buffer_size=collect_episodes_per_epoch),
]

Pass them to the Driver as observers, together with your replay buffer's observer, like this:

from tf_agents.drivers import dynamic_episode_driver

# replay_observer is a list such as [replay_buffer.add_batch].
dynamic_episode_driver.DynamicEpisodeDriver(
    tf_env,
    collect_policy,
    observers=replay_observer + train_metrics,
    num_episodes=collect_episodes_per_epoch).run()

And after this, log them to your summaries with:

# Each metric is written against train_step and against the first two metrics.
for train_metric in train_metrics:
    train_metric.tf_summaries(
        train_step=epoch_counter, step_metrics=train_metrics[:2])

In case you're wondering, the step_metrics argument makes the last two metrics (average return and average episode length) also get plotted against the first two (number of episodes and environment steps).
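Once training has started, you can point TensorBoard at the folder you used as train_dir to see the plots, e.g.:

tensorboard --logdir <root_dir>/train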