Deep learning basic thoughts

Question

I try to understand the basics of deep learning, lastly reading a bit through deeplearning4j. However, I don't really find an answer for: How does the training performance scale with the amount of training data?

Apparently, the cost function always depends on all the training data, since it just sums the squared error per input. Thus, I guess at each optimization step, all datapoints have to be taken into account. I mean deeplearning4j has the dataset iterator and the INDArray, where the data can live anywhere and thus (I think) doesn't limit the amount of training data. Still, doesn't that mean, that the amount of training data is directly related to the calculation time per step within the gradient descend?

I'm voting to close this question as off-topic because it might be better suited for [data science exchange](https://datascience.stackexchange.com/) — o-90, Nov 16 '19 at 17:36

score 1 · Accepted Answer · answered Nov 16 '19 at 16:01

1

DL4J uses iterator. Keras uses generator. Still the same idea - your data comes in batches, and used for SGD. So, minibatches matter, not the the whole amount of data you have.

answered Nov 16 '19 at 16:01

raver119

336
1
5

Does that mean the stochastic gradient descend always only considers a subset (batch) of the whole data? How does the algorithm make sure it fits its weights to all the training data? Is it just repeating it until all batches fit? Are theses the epochs? – PeMa Nov 16 '19 at 16:06
2

That's pretty much the definition of SGD :) https://en.wikipedia.org/wiki/Stochastic_gradient_descent – raver119 Nov 16 '19 at 17:12
1

Iterative algorithm, that rolls though minibatches and adjusts weights iteratively through the epoch – raver119 Nov 16 '19 at 17:12

score 0 · Answer 2 · answered Nov 18 '19 at 11:07

Fundamentally speaking it doesn't (though your mileage may vary). You must research right architecture for your problem. Adding new data records may introduce some new features, which may be hard to capture with your current architecture. I'd safely always question my net's capacity. Retrain your model and check if metrics drop.

Deep learning basic thoughts

2 Answers2