I am currently doing Andrew NG's ML course. From my calculus knowledge, the first derivative test of a function gives critical points if there are any. And considering the convex nature of Linear / Logistic Regression cost function, it is a given that there will be a global / local optima. If that is the case, rather than going a long route of taking a miniscule baby step at a time to reach the global minimum, why don't we use the first derivative test to get the values of Theta that minimize the cost function J in a single attempt , and have a happy ending?
That being said, I do know that there is a Gradient Descent alternative called Normal Equation that does just that in one successful step unlike the former.
On a second thought, I am thinking if it is mainly because of multiple unknown variables involved in the equation (which is why the Partial Derivative comes into play?) .