Question: If the learning rate (α) is too large, what happens to the graph, and how does this affect the loss function over the iterations?
I've read somewhere that the graph may not converge, or that there could be many fluctuations in it, and I would like to be clear on this. I'm also unsure how this would affect the loss function when it is plotted against the number of iterations.
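
To make the question concrete, here is a minimal sketch of what I think is going on. It assumes plain gradient descent on a toy loss L(w) = w**2, which is my own choice for illustration. The function name `run_gradient_descent` and the learning-rate values are hypothetical, not taken from any library or course.

```python
def run_gradient_descent(lr, steps=20, w0=1.0):
    """Run gradient descent on the toy loss L(w) = w**2 (an assumed example).

    Returns the parameter values and the loss values at every iteration.
    """
    w = w0
    weights = [w]
    losses = [w ** 2]
    for _ in range(steps):
        grad = 2 * w        # dL/dw for L(w) = w**2
        w = w - lr * grad   # gradient descent update with learning rate lr
        weights.append(w)
        losses.append(w ** 2)
    return weights, losses


# Compare a small, a fairly large, and a too-large learning rate.
for lr in (0.1, 0.9, 1.1):
    weights, losses = run_gradient_descent(lr)
    print(f"lr={lr}")
    print("  first weights:", [round(w, 3) for w in weights[:4]])
    print("  first losses: ", [round(l, 3) for l in losses[:4]])
    print(f"  final loss:     {losses[-1]:.4g}")
```

In this toy case each update multiplies w by (1 - 2*lr), so w jumps from one side of the minimum to the other once lr > 0.5, and the loss grows at every iteration once lr > 1. Is this the "not converging" and "fluctuating" behavior that people describe, and does the same picture carry over to a real loss function?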