I am currently running a Tensorflow convnet for image recognition and I am considering of buying new GPUs to enable more complex graphs, batch size, and input dimensions. I have read posts like this that do not recommend using AWS GPU instances to train convnets, but more opinions are always welcomed.
I've read Tensorflow's guide 'Training a Model Using Multiple GPU Cards', and it seems that the graph is duplicated across the GPUs. I would like to know is this the only way to use parallel GPUs in Tensorflow convnet?
The reason I am asking this is because if Tensorflow can only duplicate graphs across multiple GPUs, it would mean each GPU must have at least the memory size that my model requires for one batch. (Example if the minimum memory size required is 5GB, two card of 4GB each would not do the job)
Thank you in advance!