I'm training a deep Q-network to trade stocks. It has two possible actions: 0 = wait, 1 = buy a stock if none is held, or sell it if one is held. As input it gets the price it bought the stock at, the current price of the stock, and the prices of the stock over the previous 5 time steps, expressed relative to the current price. So the state looks something like
[5.78, 5.93, -0.1, -0.2, -0.4, -0.5, -0.3]
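To make that concrete, the state construction is roughly this (a simplified sketch, not my exact code; what the first slot holds when no stock is owned is just an illustrative choice here):

```python
import numpy as np

def build_state(prices, t, buy_price):
    """State at time t: [buy price, current price, previous 5 prices relative to current].
    buy_price when no position is held is an illustrative placeholder (e.g. 0)."""
    current = prices[t]
    history = prices[t - 5:t]       # prices over the previous 5 time steps
    relative = history - current    # expressed relative to the current price
    return np.concatenate(([buy_price, current], relative))
```

which gives a 7-element vector like the example above.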
The reward for a sale is simply the sale price minus the purchase price. The reward for every other action is 0, though I've tried making it negative (and various other schemes) without any improvement.
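In code, the reward logic amounts to something like this (again a simplified sketch; `holding` tracks whether a stock is currently owned):

```python
def reward(action, holding, buy_price, current_price):
    """Profit (or loss) is only realized at the moment of sale; everything else gets 0."""
    if action == 1 and holding:           # sell
        return current_price - buy_price
    return 0.0                            # wait, or buy: no immediate reward
```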
Simple, right? Unfortunately, the agent always converges on taking the "0" action, even when I magnify the reward for selling at a profit or try any number of other tweaks. I'm really pulling my hair out. Is there something obvious I've missed?