I'm trying to reproduce the work in the paper "Demand Response for Home Energy Management Using Reinforcement Learning and Artificial Neural Network". I want to optimize the power consumption of home appliances. The action space is a set of different power ratings for the home appliances. My reward function is: reward = -(power rating * electricity price).
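To make the reward concrete, here is a minimal sketch of the formula described above, written in Python rather than MATLAB. The function names, the power ratings, and the prices are hypothetical placeholders for illustration, not values from the paper or from the original code.

```python
def reward(power_rating_kw, price_per_kwh):
    """Reward as described in the post: the negative cost of the chosen rating."""
    return -(power_rating_kw * price_per_kwh)


def best_action_index(power_ratings_kw, price_per_kwh):
    """Index of the rating with the highest immediate reward at this price."""
    rewards = [reward(p, price_per_kwh) for p in power_ratings_kw]
    return max(range(len(rewards)), key=rewards.__getitem__)


# Hypothetical action space (kW) and electricity prices, for illustration only.
POWER_RATINGS_KW = [0.5, 1.0, 1.5, 2.0]
PRICES = [0.10, 0.20, 0.35]

for price in PRICES:
    idx = best_action_index(POWER_RATINGS_KW, price)
    print(f"price={price:.2f} -> rating {POWER_RATINGS_KW[idx]} kW, "
          f"reward={reward(POWER_RATINGS_KW[idx], price):.3f}")
```

Note that if the reward is only the negative cost, as written here, and prices are positive, the rating with the highest immediate reward is the lowest one at every price. It may be worth comparing this against the action the agent actually picks.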
I have trained an RL agent using the DQN algorithm in MATLAB. The agent should select from this action space, but it always takes the same action irrespective of the state. I have checked my reward function, and the agent does not select the action with the highest reward. Can anyone think of why the agent is behaving this way?
My code:
[two code screenshots, images not available]
What I'm getting while training:
And my agent always takes the same power rating regardless of the state (electricity price). Why?
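One way to confirm the symptom is to evaluate the greedy policy over a range of prices and count how often each action is chosen. The sketch below is in Python and assumes a hypothetical `q_function(state)` that returns one Q-value per action. `toy_q_function` is only a stand-in for the trained network, not the actual MATLAB agent.

```python
from collections import Counter


def greedy_action(q_values):
    """Index of the largest Q-value (the action a greedy policy would take)."""
    return max(range(len(q_values)), key=q_values.__getitem__)


def action_histogram(q_function, states):
    """Count how often each action is the greedy choice across the given states."""
    return Counter(greedy_action(q_function(s)) for s in states)


def toy_q_function(price):
    """Hypothetical stand-in for a trained Q-network: Q equals the immediate reward."""
    power_ratings_kw = [0.5, 1.0, 1.5, 2.0]  # placeholder action space
    return [-(p * price) for p in power_ratings_kw]


hist = action_histogram(toy_q_function, [0.10, 0.20, 0.35])
print(hist)  # a single key here means the same action is chosen in every state
```

If the histogram for the real agent has a single key, the Q-values can then be inspected per state to see whether they differ at all between prices.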