Multiprocessing with large ml models

Question

I have got a large transformer model from huggingface. The model is about 2gb in storage. When I try to run multiprocessing processes or pool the program just freezes. Even if I just try to have 2 workers/processes.

From what I understand it freezes because it's trying to pickle the transformer model and copy the environment for both workers.

I've tried to load the model in after the multiprocessing starts but it also results in the same challenge.

My question is do I need to increase my ram if so what's the general rule of thumb for how much ram I need per worker and how would I calculate it.

How can I get this right, I've tried making the model use a shared memory block but I've not managed to get it to work. has anyone done something like this?

score 0 · Answer 1 · answered Aug 12 '22 at 08:30

You probably have to account 2 GB (or more) for each worker, since they likely have different copies of your model.

Using shared memory is the only option if you can't increase your memory amount.

I believe that an easy rule of the thumb to understand how much RAM you need is something n_workers * per_worker_mem * 1.1. You measure per_worker_mem with free or ps command, accounting for a 10% overhead that you may have for synchronization and data exchange between threads. Your overhead may vary according to the amount of data shared and exchanged between the workers.

On a physical system you may also want to account for an additional 1/2 GB for the OS and (in general) a fair amount of free RAM to be used as cache to speedup your file system (e.g. if your model needs 6 GB of RAM, I won't go below 16 or 32 to keep a snappy system).

Multiprocessing with large ml models

1 Answers1