
I am trying to execute a retrained PyTorch FasterRCNN in multiple threads on an Nvidia Jetson Xavier.

The main thread adds the image paths to a Queue. Four worker threads each do the following:

  • load the image with PIL: img = Image.open(imgPath)
  • convert it to a tensor with img = to_tensor(img) from torchvision.transforms
  • move it to the GPU: img = img.to(device)
  • run the RCNN network: pred = model([img])
  • save the result in a normal list: resultList.append(pred)
  • delete the variable holding the image with del img
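
The worker steps above can be sketched roughly as follows. This is a minimal sketch, not the original code: the names make_predict_fn, worker and run are hypothetical, and it assumes a torchvision-style detection model that returns a list of dicts of tensors in eval mode. The torch.no_grad() block and the .cpu() copy of each prediction are also assumptions, since the question does not say whether the original code does either. Without them, every entry in the result list could keep GPU memory (and possibly autograd state) alive.

```python
import queue
import threading


def make_predict_fn(model, device):
    """Build a function that runs one image through the model (assumed API)."""
    # Imported lazily so the queue logic below can be tried without PyTorch.
    import torch
    from PIL import Image
    from torchvision.transforms.functional import to_tensor

    model.eval()

    def predict(img_path):
        # Close the file handle as soon as the pixels are in a tensor.
        with Image.open(img_path) as pil_img:
            img = to_tensor(pil_img.convert("RGB")).to(device)
        # Inference only: do not record operations for autograd.
        with torch.no_grad():
            pred = model([img])[0]
        # Store CPU copies so the result list does not pin GPU memory.
        return {k: v.cpu() for k, v in pred.items()}

    return predict


def worker(path_queue, results, predict, lock):
    """Consume paths until a None sentinel arrives."""
    while True:
        img_path = path_queue.get()
        try:
            if img_path is None:
                return
            out = predict(img_path)
            with lock:
                results.append(out)
        finally:
            path_queue.task_done()


def run(paths, predict, num_workers=4):
    """Feed all paths to the workers and collect their results."""
    path_queue = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(path_queue, results, predict, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for p in paths:
        path_queue.put(p)
    for _ in threads:
        path_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

With a real model this would be called as run(image_paths, make_predict_fn(model, device)). Because predict is passed in as a plain callable, the threading part can also be exercised with a stub function to check whether the leak comes from the queue handling or from the inference step.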

However, the process runs out of memory after around 10,000 images and gets killed by the operating system.

I tried the following steps after 1000 images:

  • stop all threads
  • run garbage collection with gc.collect()
  • clear GPU memory with torch.cuda.empty_cache()
  • restart the threads

However, as expected, this does not solve the problem.

I know PyTorch's DataLoader can load data in parallel. Since the RCNN is part of a larger project, I tried to do it without the DataLoader, inside the execution task.

I'm pretty sure no list is holding on to the images, because then the memory would run out much sooner. The results of the network are just bounding boxes, so they should not consume much memory either. Additionally, the memory consumption does not grow gradually; instead, it sometimes jumps by around 1 GB.

I hope someone has an idea for how to solve the problem or how to debug it better.
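
One possible way to debug this, as a sketch only: Python's standard tracemalloc module can compare two snapshots and list the source lines that allocated the most new memory in between. The helper top_growth and the simulated leak below are hypothetical and just illustrate the mechanics. Note that tracemalloc only sees allocations made through Python's own allocators, so CUDA memory and buffers allocated inside native libraries may not show up.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed most between snapshots."""
    # compare_to sorts by the absolute size difference, largest first.
    return after.compare_to(before, "lineno")[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Simulated leak: 20 buffers of 1 MiB each that stay referenced.
leak = [bytearray(1024 * 1024) for _ in range(20)]

after = tracemalloc.take_snapshot()
for stat in top_growth(before, after):
    print(stat)
```

In the real program the two snapshots would be taken, for example, 1000 images apart, and the lines at the top of the list are the first candidates to inspect.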

Thanks, Peter

  • Can you add your code? It's possible some variable containing a large amount of data is not going out of scope and therefore Python is not able to reclaim the memory. – Shiva Dec 08 '20 at 14:32
  • Thanks for your reply. I will have to write a comparable example, since the code is embedded and hard to read. However, before I used multithreading, the code was able to process more than 30,000 images without problems. – PeterAlgoMaker Dec 08 '20 at 14:42
  • Silly suggestion - https://www.google.com/search?q=python+threading+memory+leak Check out some of those StackOverflow answers. One of them could solve your problem. Without looking at the code it's difficult to troubleshoot. – Shiva Dec 08 '20 at 14:47
  • In the end I did find an ordinary memory leak: in some cases the image was not released. – PeterAlgoMaker Dec 08 '20 at 20:35
