I don't fully understand how automatic differentiation can reduce the time complexity.
I'll assume you understand how backpropagation and/or reverse-mode automatic differentiation works for a scalar function y = f(x), where f : R^n -> R. Suppose computing f takes time O(1); that is, we measure all costs relative to one forward evaluation of f.
If you remember from backpropagation, you start by setting dy/dy = 1 and then compute dy/dx using the chain rule. What you're doing here is computing the product of the vector [1.0] with the Jacobian, which for a scalar-valued f is a 1 x n row vector; the resulting vector-Jacobian product is again a 1 x n row vector, namely the gradient. So you are computing a VJP where your vector is just a length-1 vector. This takes time O(1), i.e. a constant multiple of one forward pass.
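As a concrete sketch in JAX (the particular f and x here are just illustrative placeholders, not anything from the question): jax.vjp returns the output together with a function that computes the vector-Jacobian product, and seeding that function with 1.0 recovers the full gradient in one backward pass.

```python
import jax
import jax.numpy as jnp

# A toy scalar function f : R^n -> R (placeholder for illustration).
def f(x):
    return jnp.sum(jnp.sin(x) ** 2)

x = jnp.arange(1.0, 5.0)  # a point in R^4

# jax.vjp gives us y = f(x) plus a function computing v |-> v * J,
# where J is the 1 x n Jacobian (the gradient, as a row vector).
y, f_vjp = jax.vjp(f, x)

# Seed the backward pass with the "vector" [1.0] (a scalar here,
# since the output is scalar) to get dy/dx in one backward pass.
(grad_x,) = f_vjp(jnp.ones_like(y))

# Sanity check: this matches jax.grad.
assert jnp.allclose(grad_x, jax.grad(f)(x))
```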
Now suppose I'm computing the derivative of a vector-valued function y = f(x), where f : R^n -> R^m. Then the Jacobian is indeed m x n, and computing the full Jacobian with reverse-mode AD takes time O(m), since it needs one backward pass per output component. However, you can compute the vector-Jacobian product v^T J with a vector v in time O(1): instead of seeding a single output with 1, you seed each output y_i with the adjoint v_i, for i = 1, ..., m. Then you just backpropagate like normal and get the VJP in a single backward pass.
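Again a minimal JAX sketch (f, x, and v are illustrative placeholders): one call to the VJP function seeded with v costs a single backward pass, whereas recovering the full m x n Jacobian, as jax.jacrev does, takes m such passes, one per output, each seeded with a standard basis vector e_i.

```python
import jax
import jax.numpy as jnp

# A toy vector-valued function f : R^n -> R^m (placeholder for illustration).
def f(x):
    return jnp.stack([jnp.sum(x ** 2), jnp.prod(x), jnp.sum(jnp.sin(x))])

x = jnp.arange(1.0, 5.0)          # a point in R^4, so m = 3, n = 4
v = jnp.array([1.0, -2.0, 0.5])   # an arbitrary seed vector in R^m

y, f_vjp = jax.vjp(f, x)

# One backward pass: the VJP v^T J, at roughly the cost of evaluating f once.
(vjp_result,) = f_vjp(v)

# The full Jacobian needs m backward passes, one per output row,
# each seeded with a basis vector e_i -- hence the O(m) cost.
J = jax.jacrev(f)(x)                                  # shape (m, n)
rows = jnp.stack([f_vjp(e)[0] for e in jnp.eye(3)])   # m separate VJPs
assert jnp.allclose(J, rows)
assert jnp.allclose(vjp_result, v @ J)
```

The same asymmetry explains why reverse mode is the right tool when m is small and n is large (e.g. a scalar loss over many parameters): a single seeded backward pass gives you everything you need without ever forming the Jacobian.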