Pre-trained networks like VGG16 or Inception usually work with low-resolution inputs, typically under 500 px.
Is it possible to add a high-resolution convolution layer (or two) before the very first layer of a pre-trained VGG16 / Inception so that the network can consume high-resolution pictures?
As far as I know, the first layers are the hardest to train; they took a lot of data and compute to learn.
I wonder if it would be possible to freeze the pre-trained network and train only the newly attached high-resolution layers on an average GPU with about 3,000 examples. Could that be done in a couple of hours?
Also, if you know of any examples of image classification with high-resolution images, please share a link.
P.S.
The problem with the usual downscaling approach is that in our case tiny details like small cracks or dirt specks are very important, and they are lost at lower resolutions.