
I am transcoding video on an NVIDIA Quadro K4200 under Ubuntu (ffmpeg version 2.7.1, NVENC SDK 5.0.1). GPU memory usage for one stream is 100 MB. Please see the output of the nvidia-smi command: [screenshot of nvidia-smi output]

But when I run the same transcoding process with the same ffmpeg parameters on another computer with an NVIDIA GTX 980 Ti (ffmpeg version 3.0, NVENC SDK 5.0.1), GPU memory usage for one stream is 170 MB. Please see the screenshot below: [screenshot of nvidia-smi output]

Why is there such a difference in memory usage? Can I decrease the GPU memory usage on the GTX 980 Ti to 100 MB per transcode process, as on the Quadro K4200?
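
For reference, here is a minimal sketch of the kind of single-stream NVENC transcode and memory check involved; the input file, bitrate, and other settings below are placeholders rather than my exact parameters:

    # Hypothetical single-stream NVENC transcode ("nvenc" is the encoder name
    # exposed by ffmpeg 2.7/3.0 builds with NVENC support; the input file and
    # bitrate are placeholders)
    ffmpeg -i input.mp4 -c:v nvenc -b:v 4M -c:a copy output.mp4

    # While the transcode runs, list per-process GPU memory as reported by the driver
    nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv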

Ivan Kolesnikov

1 Answer


Your answer:

On Quadro and Tesla GPUs, the maximum number of simultaneous NVENC sessions is unlimited, and as such these platforms will often incur lower driver overheads for the same unit of work.
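
As a rough way to see the effect of that session cap in practice, here is a minimal sketch that launches several concurrent NVENC encodes against one GPU; the input file, bitrate, and session count are assumptions for illustration only:

    #!/bin/sh
    # Launch several concurrent NVENC encodes against the same GPU.
    # On session-limited GeForce cards of this era (typically capped at two
    # simultaneous sessions) the extra ffmpeg processes fail to open an
    # encoder session; on Quadro/Tesla they should all run.
    # input.mp4, the bitrate and the session count are placeholders.
    for i in 1 2 3 4; do
      ffmpeg -y -i input.mp4 -c:v nvenc -b:v 4M -an "out_${i}.mp4" \
        > "log_${i}.txt" 2>&1 &
    done
    wait

    # Inspect the per-process GPU memory of the sessions that did start
    nvidia-smi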

It is also worth considering that, unlike NVCUVENC (which uses your CUDA cores to encode elementary video streams), NVENC is a dedicated hardware block, a Silicon Intellectual Property (SIP) core, and if you're comparing across different driver and platform versions, all other factors remaining constant, your mileage will always vary.
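
One way to observe that split with standard tooling, assuming your driver is recent enough to support nvidia-smi's dmon mode, is to watch encoder utilization separately from CUDA core (SM) utilization while a transcode runs:

    # Per-second utilization monitoring: during an ffmpeg nvenc run the "enc"
    # column (NVENC utilization) rises while "sm" (CUDA core utilization)
    # stays largely idle, consistent with the encode running on the dedicated
    # SIP block rather than on the CUDA cores.
    nvidia-smi dmon -s u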

Thanks and regards,

Brainiarc7.

Dennis Mungai
  • Thank you! I had been looking for the answer to this question for a very long time. – Ivan Kolesnikov Apr 18 '16 at 06:02
  • You're welcome. Feel free to ask me anything concerning NVENC. – Dennis Mungai Apr 20 '16 at 10:06
  • I noticed that one NVENC process on the Quadro K4200 costs 100 MB of video memory, but one NVENC process on a GTX 650 costs only 82 MB with the same FFmpeg settings. Do you know why there is such a difference? The maximum number of simultaneous NVENC sessions on the GTX 650 is limited, so it should incur larger driver overheads for the same work than the K4200. – Ivan Kolesnikov Apr 30 '16 at 11:50
  • As stated in the last part, your mileage may vary, mostly because device drivers have their own overheads (even when utilization measurements are done via SMI). Comparing utilization for the same workloads across two different SKUs, all other factors remaining constant, will always show variance due to the aforementioned limitation. – Dennis Mungai May 02 '16 at 18:11
  • And by the way, the Nvidia Quadro K4200 is similar to the Nvidia GeForce GTX 760 in rendering performance (and most likely, NVENC when comparing single session encodes). – Dennis Mungai May 02 '16 at 18:14
  • Thanks a lot for the explanation! – Ivan Kolesnikov May 05 '16 at 09:33