I checked out some notable open-source frameworks with SGD implementations: scikit-learn, Vowpal Wabbit and TensorFlow.
All of them leave the task of deciding how many iterations to run to the user! scikit-learn requires the user to specify it explicitly, Vowpal Wabbit assumes 1 epoch (one pass through all examples) by default but allows changing it to any number of epochs, and TensorFlow implements just a single step for a single example, leaving the entire iteration loop to the user (see the sketch below).
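To make the contrast concrete, here is a rough sketch of what that looks like in practice. It is not the frameworks' exact APIs: the parameter names (e.g. `max_iter`) are from the versions I looked at and may differ in yours, and the TensorFlow part uses the TF2-style eager API.

```python
# Rough sketch only; parameter names and APIs may differ across versions.
import tensorflow as tf
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# scikit-learn: the number of epochs is an explicit user-supplied argument.
clf = SGDClassifier(max_iter=50)      # I have to pick 50 myself
clf.fit(X, y)

# TensorFlow (TF2-style here): only a single update step is provided; the
# epoch loop, and therefore the termination decision, is entirely my code.
w = tf.Variable(tf.zeros([20, 1]))
b = tf.Variable(0.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.01)
X_tf = tf.constant(X, dtype=tf.float32)
y_tf = tf.constant(y, dtype=tf.float32)

for epoch in range(50):               # again, I have to pick 50 myself
    with tf.GradientTape() as tape:
        logits = tf.squeeze(tf.matmul(X_tf, w) + b)
        loss = tf.reduce_mean(
            tf.nn.sigmoid_cross_entropy_with_logits(labels=y_tf, logits=logits))
    grads = tape.gradient(loss, [w, b])
    opt.apply_gradients(zip(grads, [w, b]))
```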
Why is that? Deciding on termination isn't trivial at all. Should we stop when the loss stops improving? When the average loss over the last N iterations stops improving? Should the user measure loss on a validation/hold-out set? Or maybe it isn't about the loss at all, and we should check whether the optimized weights have stopped changing by much? Should we check for termination after every example, or only once in a while?
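For instance, one of the simplest criteria I could imagine a framework baking in is "stop when the held-out loss hasn't improved for a while". Here is a minimal sketch; all names and numbers (`tol`, `patience`, the callbacks) are placeholders of mine, not any framework's actual API:

```python
# Minimal sketch of one possible built-in criterion: stop when the hold-out
# loss has not improved by at least `tol` for `patience` consecutive checks.
def train_with_early_stopping(step_fn, eval_loss_fn, max_epochs=1000,
                              tol=1e-4, patience=5):
    best_loss = float("inf")
    bad_checks = 0
    for epoch in range(max_epochs):
        step_fn()                      # one pass of SGD over the training data
        loss = eval_loss_fn()          # loss on a validation / hold-out set
        if loss < best_loss - tol:     # meaningful improvement, keep going
            best_loss = loss
            bad_checks = 0
        else:                          # no improvement this check
            bad_checks += 1
            if bad_checks >= patience:
                return epoch           # terminate: loss has plateaued
    return max_epochs
```

Even this tiny sketch already forces choices about `tol`, `patience`, and how often to evaluate, which is exactly the kind of decision I was hoping the frameworks would make for me.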
I'd be happy if someone could shed some light on this design decision. Am I missing something that makes it impossible to do internally? The theory in this area is heavy, and I was hoping for some support from the frameworks.