I am currently training several hundred different permutations of neural networks. Levenberg-Marquardt backpropagation converges relatively quickly, but for academic reasons I would prefer to use plain gradient descent for now. Unfortunately, gradient descent is so slow that I end up stopping it early, because training all of the networks that way would take far too long.
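For concreteness, here is roughly the kind of plain batch gradient descent loop I have in mind (a minimal sketch with placeholder data, network size, and learning rate, not my actual code):

```python
import numpy as np

# Minimal sketch of plain full-batch gradient descent on a one-hidden-layer
# network. Data, layer sizes, and the learning rate are placeholders.
rng = np.random.default_rng(0)

# Toy regression data: 100 samples, 4 inputs, 1 target.
X = rng.normal(size=(100, 4))
y = rng.normal(size=(100, 1))

n_hidden = 10
W1 = rng.normal(scale=0.1, size=(4, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.05  # fixed learning rate
for epoch in range(10000):
    # Forward pass: sigmoid hidden layer, linear output, MSE loss.
    h = sigmoid(X @ W1 + b1)
    y_hat = h @ W2 + b2
    err = y_hat - y
    mse = np.mean(err ** 2)

    # Backward pass: gradients of the MSE w.r.t. each parameter.
    grad_out = 2.0 * err / len(X)
    dW2 = h.T @ grad_out
    db2 = grad_out.sum(axis=0)
    grad_h = (grad_out @ W2.T) * h * (1.0 - h)
    dW1 = X.T @ grad_h
    db1 = grad_h.sum(axis=0)

    # Plain gradient descent update: small fixed steps, many epochs.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```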
Are there ways to speed up the gradient descent process, preferably not involving parallel computing techniques?