Since the TF API doc does not explain what collect_policy really is,
I looked into the source code: https://github.com/tensorflow/agents/blob/master/tf_agents/agents/dqn/dqn_agent.py
Can policy and collect_policy be described as the exploitation-phase policy and the exploration-phase policy, respectively (in terms of classic reinforcement learning without NNs)?
I just need to pin it to something I already know.
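
To make the question concrete, this is the tabular picture I have in mind. It is only a sketch of my assumption, not a claim about how TF-Agents implements either policy: the function names and the Q-values are made up for illustration, and nothing here is TF-Agents API.

```python
import random

def greedy_action(q_row):
    """Exploitation: pick the action with the largest Q-value (first one on ties)."""
    return max(range(len(q_row)), key=lambda a: q_row[a])

def epsilon_greedy_action(q_row, epsilon, rng=random):
    """Exploration: with probability epsilon pick a uniformly random action,
    otherwise fall back to the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return greedy_action(q_row)

# Hypothetical Q-values for a single state with three actions.
q_row = [0.1, 0.7, 0.2]

eval_action = greedy_action(q_row)                 # what I imagine "policy" does
collect_action = epsilon_greedy_action(q_row, 0.1) # what I imagine "collect_policy" does
```

If the analogy holds, policy would correspond to greedy_action (used for evaluation) and collect_policy to epsilon_greedy_action (used while gathering experience).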