I found out that data augmentation can be done in PyTorch using torchvision.transforms. I also read that the transformations are applied at each epoch. So I'm wondering whether the effect of copying each sample multiple times and then applying a random transformation to the copies is the same as using torchvision.transforms on the original data set (unique images) and just training for a longer time (more epochs). Thanks in advance.
-
"random transforms on original data set at each epoch and just running it for more epochs" - very confusing honestly – Shihab Shahriar Khan Mar 03 '19 at 19:11
-
I hope the edit clarifies it. – Farzad Mar 03 '19 at 19:42
-
If there is any transformation with the word "Random" in it, then no, not the same. – Shihab Shahriar Khan Mar 03 '19 at 19:45
-
Is it possible to predict which one leads to better performance? – Farzad Mar 03 '19 at 19:47
-
Applying at the beginning of every epoch, because that's the whole point of "random" transformations, i.e. generating diverse samples. – Shihab Shahriar Khan Mar 03 '19 at 19:56
-
It makes sense now. Thanks Shihab Shahriar. – Farzad Mar 03 '19 at 20:02
1 Answer
This question needs a broader answer. Don't misunderstand: torchvision.transforms does not enlarge your dataset. It applies random (or deterministic) transforms to your current data set at runtime, so each sample can look different on each access and at each epoch.

"the effect of copying each sample multiple times and then applying random transformation to them is same as using torchvision.transforms on original data set (unique images) and just training it for a longer time (more epochs)."

Answer: to physically enlarge your dataset you can copy samples, or use PyTorch or the WEKA software. More epochs, however, are a completely different concept. More epochs generally improve the model, but only up to the point where the validation loss stops decreasing while the training loss keeps falling; beyond that, the model overfits. Hope this helps.
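The runtime behaviour described above can be sketched without torchvision. Below is a minimal stand-in (the names `random_flip` and `AugmentedDataset` are hypothetical, not PyTorch APIs): a transform applied inside `__getitem__` yields potentially different views of the same stored samples at each epoch, while the stored dataset itself never grows.

```python
import random

# Stand-in for a torchvision-style random transform: reverses a
# sequence with probability p (mirrors the idea of RandomHorizontalFlip).
def random_flip(sample, p=0.5):
    return sample[::-1] if random.random() < p else sample

class AugmentedDataset:
    """Applies the transform at access time, like a PyTorch Dataset whose
    __getitem__ calls self.transform(img). The stored data never grows."""
    def __init__(self, data, transform):
        self.data = data
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.transform(self.data[idx])

data = [(1, 2, 3), (4, 5, 6)]
ds = AugmentedDataset(data, random_flip)

random.seed(0)  # fixed seed so the two "epochs" below are reproducible
epoch1 = [ds[i] for i in range(len(ds))]
epoch2 = [ds[i] for i in range(len(ds))]

# The dataset size is unchanged; only the *views* of each sample vary
# from one epoch to the next.
print(len(ds), epoch1, epoch2)
```

In contrast, pre-copying each sample N times and transforming the copies once fixes the augmented views forever, whereas the runtime approach keeps drawing fresh views every epoch, which is why the two strategies are not equivalent.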
