Is cudnn convolution workspace reusable?

Question

I need to find a reference or description regarding the workspace that is passed to the cudnnConvolutionForward, cudnnConvolutionBackwardData, and cudnnConvolutionBackwardFilter family of functions.

Can I reuse the workspace for next calls/layers assuming that different layers aren't executed in parallel on the GPU?

I'm looking at Caffe's implementation in cudnn_conv_layer.cpp, and each instance of the layer allocates its own separate space for each of the 3 functions. This seems wasteful, since logically I should be able to reuse the memory for multiple layers/functions.

However, I can't find a reference that explicitly allows or disallows this. Caffe keeps a separate workspace for each and every layer, and I suspect that in total it may "waste" a lot of memory.

Alexander Pivovarov · Accepted Answer · 2021-04-23T16:35:34.873

Yes, you can reuse the workspace for calls from different layers. The workspace is just scratch memory that the algorithm needs while it runs; it is not a context that has to be initialized or that keeps state between calls. You can see this in the cuDNN user guide (look e.g. at the documentation for cudnnGetConvolutionForwardWorkspaceSize). That is also why, inside one layer, the workspace size is computed as the maximum of the workspaces needed by any of the algorithms applied (multiplied by CUDNN_STREAMS_PER_GROUP, and also by the number of groups if there is more than one, since groups can be executed in parallel).
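To make the sizing rule above concrete, here is a minimal sketch that compares one workspace per layer against a single workspace shared by all layers. It is not Caffe's code and it does not call cuDNN: the struct, the function names, and the byte counts are all hypothetical. In real code each number would come from cudnnGetConvolutionForwardWorkspaceSize and its backward-data and backward-filter counterparts. It also assumes layers never run concurrently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-layer requirements in bytes. In real code these would be
// queried from cuDNN for the algorithms chosen for each layer.
struct LayerWorkspaceNeeds {
    std::size_t forward_bytes;
    std::size_t backward_data_bytes;
    std::size_t backward_filter_bytes;

    // The three calls of one layer run one after another, so the layer needs
    // only the largest of the three, scaled by the number of calls that may
    // run in parallel inside the layer (parallel_slots is an assumption that
    // stands in for groups * streams per group).
    std::size_t layer_bytes(std::size_t parallel_slots = 1) const {
        assert(parallel_slots >= 1);
        return parallel_slots *
               std::max({forward_bytes, backward_data_bytes, backward_filter_bytes});
    }
};

// One buffer per layer: total memory is the sum over layers.
std::size_t per_layer_total_bytes(const std::vector<LayerWorkspaceNeeds>& layers) {
    std::size_t total = 0;
    for (const auto& layer : layers) total += layer.layer_bytes();
    return total;
}

// One buffer shared by all layers (valid only if layers never overlap in
// time): total memory is the maximum over layers.
std::size_t shared_total_bytes(const std::vector<LayerWorkspaceNeeds>& layers) {
    std::size_t total = 0;
    for (const auto& layer : layers) total = std::max(total, layer.layer_bytes());
    return total;
}
```

With three made-up layers needing at most 200, 300, and 64 bytes, the per-layer scheme reserves 564 bytes while the shared scheme reserves 300. The saving depends entirely on how the real per-layer sizes are distributed.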

That said, in Caffe it is quite possible for 2 instances of any layer to be computed in parallel, and layers that run concurrently cannot share one workspace. I also don't think workspaces are that large compared to the actual data one has to store for one batch (though I'm not sure about this part, since it depends on the NN architecture and the algorithms used), so I doubt you can gain much by reusing the workspace in common cases.

In theory you could always allocate the workspace right before the corresponding library call and free it right after, which would save even more memory, but it would probably degrade performance to some extent.
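The allocate-right-before, free-right-after idea can be sketched with a small RAII wrapper. This is only an illustration under stated assumptions: the class and struct names are hypothetical, and host memory stands in for device memory so the example runs without a GPU. A real version would allocate with cudaMalloc and release with cudaFree, and those calls are where the performance cost would come from.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical bookkeeping so the memory behaviour of the pattern is visible.
struct WorkspaceStats {
    std::size_t live_bytes = 0;   // bytes currently allocated
    std::size_t peak_bytes = 0;   // largest value live_bytes has reached
    std::size_t allocations = 0;  // number of allocations performed
};

// Owns a scratch buffer for the duration of one library call.
// Host memory is used here as a stand-in for a device allocation.
class ScopedWorkspace {
public:
    ScopedWorkspace(WorkspaceStats& stats, std::size_t bytes)
        : stats_(stats), bytes_(bytes), data_(new unsigned char[bytes]) {
        stats_.live_bytes += bytes_;
        stats_.peak_bytes = std::max(stats_.peak_bytes, stats_.live_bytes);
        ++stats_.allocations;
    }

    ~ScopedWorkspace() {
        assert(stats_.live_bytes >= bytes_);
        stats_.live_bytes -= bytes_;  // buffer itself is freed by unique_ptr
    }

    ScopedWorkspace(const ScopedWorkspace&) = delete;
    ScopedWorkspace& operator=(const ScopedWorkspace&) = delete;

    void* data() { return data_.get(); }
    std::size_t size() const { return bytes_; }

private:
    WorkspaceStats& stats_;
    std::size_t bytes_;
    std::unique_ptr<unsigned char[]> data_;
};
```

Each convolution call would construct a ScopedWorkspace of exactly the size it needs, pass data() and size() to the library, and let the destructor release the buffer. Peak usage then equals the largest single request, at the price of one allocation and one free per call.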

I couldn't find a description of reusability in the docs... And if I don't put a limit on the workspace in Caffe, it can grow to 1.5G on resnet50 — Artyom, Apr 24 '21 at 19:56
The docs describe exactly how the workspace is used by the different algorithms, e.g. for forward convolution see here: https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnConvolutionFwdAlgo_t . You might have already tried this, but you can get some insight into which algorithms your network is using by setting the environment variables CUDNN_LOGINFO_DBG to 1 and CUDNN_LOGDEST_DBG to some path before initializing Caffe. The resulting logs will tell you which algorithms are being picked (see https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html#api-logging). — Alexander Pivovarov, Apr 24 '21 at 23:23
