Given that the CUDA context has to be synchronized across threads when making NVENC calls, is there any true concurrency when encoding multiple streams with multiple threads (each thread handling a single stream)?
Wouldn't we be better off doing everything in a single thread, saving the syscalls for mutex locking and similar overhead?
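
To make the comparison concrete, here is a minimal sketch of the two designs being weighed. It is only an illustration under assumptions, not real NVENC code: `SharedContext`, `submit_frame`, `encode_multi_threaded` and `encode_single_threaded` are hypothetical names, and the mutex-guarded counter merely stands in for whatever context-bound encode call each thread would make.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared CUDA context; not a real CUDA type.
struct SharedContext {
    std::mutex lock;            // serializes the submit step across threads
    long frames_submitted = 0;  // counts frames that passed through the locked section
};

// Stand-in for the locked section: in the real application this is where the
// context-bound encode call for one frame would go (assumption).
void submit_frame(SharedContext& ctx) {
    std::lock_guard<std::mutex> guard(ctx.lock);
    ++ctx.frames_submitted;
}

// Design A: one thread per stream, all contending on the same context lock.
long encode_multi_threaded(int streams, int frames_per_stream) {
    SharedContext ctx;
    std::vector<std::thread> workers;
    for (int s = 0; s < streams; ++s) {
        workers.emplace_back([&ctx, frames_per_stream] {
            for (int f = 0; f < frames_per_stream; ++f) {
                submit_frame(ctx);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return ctx.frames_submitted;
}

// Design B: a single thread round-robins over the streams, so no lock is needed.
long encode_single_threaded(int streams, int frames_per_stream) {
    long frames_submitted = 0;
    for (int f = 0; f < frames_per_stream; ++f) {
        for (int s = 0; s < streams; ++s) {
            ++frames_submitted;  // stand-in for submitting one frame of stream s
        }
    }
    return frames_submitted;
}
```

Both functions submit the same number of frames (`streams * frames_per_stream`); the only difference is whether every submission goes through the shared lock. The sketch says nothing about how much work actually overlaps on the GPU once a frame has been submitted, which is the part I am asking about.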