As an example of getting stuck, let our cost function be J(x, y) = x * y and suppose we are currently at the point (0, 0).
The gradient of J is (y, x), which is (0, 0) at that point. That means gradient descent will not move us to any other point.
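Here is a minimal sketch (in Python, with an arbitrarily chosen step size alpha) illustrating that the update never leaves (0, 0), since the gradient vanishes there:

```python
# Gradient-descent sketch for J(x, y) = x * y.
# The gradient of J is (dJ/dx, dJ/dy) = (y, x), which is (0, 0) at the origin.

def grad_J(x, y):
    # Partial derivatives of J(x, y) = x * y
    return y, x

x, y = 0.0, 0.0      # starting point
alpha = 0.1          # step size (arbitrary choice for illustration)

for step in range(5):
    gx, gy = grad_J(x, y)
    x, y = x - alpha * gx, y - alpha * gy
    print(step, (x, y))  # stays at (0.0, 0.0) every iteration
```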
For the latter question, consider another example: the partial derivative of F(x, y) with respect to x (call it Fx(x, y)) is negative, and the partial derivative with respect to y (call it Fy(x, y)) is also negative. What gradient descent does is move along the vector -alpha * (Fx(x, y), Fy(x, y)). How can we guarantee that F(x - alpha * Fx(x, y), y - alpha * Fy(x, y)) < F(x, y) for any sufficiently small alpha?
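One standard way to see it is the first-order Taylor expansion: F(x - alpha * Fx, y - alpha * Fy) ≈ F(x, y) - alpha * (Fx^2 + Fy^2), and the correction term is strictly negative whenever the gradient is nonzero. Below is a small numeric sketch of this claim, using a hypothetical function F(x, y) = exp(-x) + exp(-y) chosen so that both partial derivatives are negative, as in the example above; the function, point, and step sizes are assumptions for illustration only:

```python
import math

# Hypothetical example: F(x, y) = exp(-x) + exp(-y), so both partial
# derivatives Fx = -exp(-x) and Fy = -exp(-y) are negative everywhere.

def F(x, y):
    return math.exp(-x) + math.exp(-y)

def grad_F(x, y):
    return -math.exp(-x), -math.exp(-y)

x, y = 1.0, 2.0
fx, fy = grad_F(x, y)

# Step along the negative gradient and compare F before and after the step.
for alpha in (1.0, 0.1, 0.01, 0.001):
    new_val = F(x - alpha * fx, y - alpha * fy)
    print(alpha, new_val < F(x, y))  # prints True for these step sizes
```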