
I have not seen anything in the RLlib documentation that would allow me to print a quick summary of the model, like print(model.summary()) in Keras. I tried using tf-slim and

import tensorflow as tf
import tf_slim as slim
variables = tf.compat.v1.model_variables()
slim.model_analyzer.analyze_vars(variables, print_info=True)

to get a rough idea for TensorFlow models, but this found no variables after the model was initialized (the call was inserted at the end of the ESTrainer class's _init). Specifically, I have been trying to get a summary of an Evolution Strategies (ES) policy to verify that changes to the model config are being applied as expected, but I have not been able to get a summary print working.

Is there an existing method for this? Is slim expected to work here?

Mandias

1 Answer


The training agent can return the policy, which gives you access to the model:

import ray.rllib.agents.ppo as ppo

agent = ppo.PPOTrainer(config, env=select_env)

policy = agent.get_policy()
policy.model.base_model.summary() # Prints the model summary

Sample output:

 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 observations (InputLayer)      [(None, 7)]          0           []                               
                                                                                                  
 fc_1 (Dense)                   (None, 256)          2048        ['observations[0][0]']           
                                                                                                  
 fc_value_1 (Dense)             (None, 256)          2048        ['observations[0][0]']           
                                                                                                  
 fc_2 (Dense)                   (None, 256)          65792       ['fc_1[0][0]']                   
                                                                                                  
 fc_value_2 (Dense)             (None, 256)          65792       ['fc_value_1[0][0]']             
                                                                                                  
 fc_out (Dense)                 (None, 5)            1285        ['fc_2[0][0]']                   
                                                                                                  
 value_out (Dense)              (None, 1)            257         ['fc_value_2[0][0]']             
                                                                                                  
==================================================================================================
Total params: 137,222
Trainable params: 137,222
Non-trainable params: 0
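
For reference, a self-contained version of the same pattern (a minimal sketch, assuming an older Ray release where ray.rllib.agents.ppo.PPOTrainer is available, the TensorFlow framework, and "CartPole-v0" standing in for select_env):

import ray
import ray.rllib.agents.ppo as ppo

ray.init(ignore_reinit_error=True)

config = ppo.DEFAULT_CONFIG.copy()
config["framework"] = "tf"         # the Keras base_model is only exposed by the TF models

agent = ppo.PPOTrainer(config, env="CartPole-v0")
policy = agent.get_policy()
policy.model.base_model.summary()  # prints the Keras summary of the default fully connected net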
  • Thanks for the response. Is this possible if I am using tune? Like so: 'analysis = tune.run(ESTrainer, config=ES_config, stop=stop, checkpoint_freq=5, **extra_kwargs)'. Otherwise I will try to refactor my code to the format you are showing. – Mandias Jan 05 '22 at 00:53
  • I haven't tried myself but I noticed the first parameter of tune.run can be a function that will let you instantiate a trainer and thus get access to the policy: https://github.com/ray-project/ray/issues/8379#issuecomment-626239029 – Francois Beaussier Jan 05 '22 at 11:31
  • I will give that a try, thank you. I accepted your answer since this seems like plenty of info for me to get it working. – Mandias Jan 05 '22 at 19:57
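
Building on the suggestion in the comments above, here is a rough sketch of the function-trainable route. ES_config and stop are the dictionaries from the question; whether ESTrainer exposes get_policy() like other trainers and whether the ES policy's model has a Keras base_model are assumptions to verify.

from ray import tune
from ray.rllib.agents.es import ESTrainer

def train_es(config):
    # Build the trainer yourself so you can inspect the policy/model up front.
    trainer = ESTrainer(config=config)
    policy = trainer.get_policy()          # assumed to work like PPOTrainer above
    if hasattr(policy.model, "base_model"):
        policy.model.base_model.summary()  # print the Keras summary, if exposed
    while True:
        result = trainer.train()
        # Report a couple of scalars so that stop criteria keyed on them can take effect.
        tune.report(
            episode_reward_mean=result.get("episode_reward_mean"),
            timesteps_total=result.get("timesteps_total"),
        )

analysis = tune.run(train_es, config=ES_config, stop=stop)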