
I have a computationally intensive model written in Python, with array calculations involving over 200,000 cells for over 4,000 time steps. There are two arrays: a fine grid array and a coarser grid mesh. Information from the fine grid array is used to inform the characteristics of the coarse grid mesh. When the program runs, it only uses 1% of the CPU but maxes out the RAM (8 GB), and it takes days to finish. What would be the best way to start solving this problem? Would GPU processing be a good idea, or do I need to find a way to offload some of the completed calculations to the HDD?

I am just trying to find avenues of thought to move towards a solution. Is my model simply pulling too much data into RAM, resulting in slow calculations?

  • Most likely your primary data structure is set up in an inefficient manner. For example, if you only need one bit per cell, you could represent 32 cells in each 32-bit word instead of one cell per word. It sounds to me like you are probably storing more copies of the data than you realize. Also, are you possibly using something like a dictionary to store the data instead of a list? – Itsme2003 Jul 01 '18 at 06:52
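As a rough illustration of the bit-packing idea in the comment above — a minimal sketch using NumPy, assuming the fine-grid cells can be reduced to boolean flags (which may or may not match the actual model):

```python
import numpy as np

# 200,000 boolean cells: one byte per cell as a NumPy bool array.
cells = np.random.rand(200_000) > 0.5

# Packing 8 cells per byte cuts memory ~8x relative to bool,
# and far more relative to a Python list of ints or floats.
packed = np.packbits(cells)

print(cells.nbytes)   # 200000 bytes
print(packed.nbytes)  # 25000 bytes
```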

2 Answers


Sounds like your problem is memory management. You're likely writing to your swap file, which would drastically slow down your processing. A GPU wouldn't help with this, since, as you said, you're maxing out your RAM, not your CPU. You probably need to rewrite your algorithm or use different data types, but since you haven't shared your code it's hard to diagnose based only on what you've written. I hope this is enough information to get you heading in the right direction.
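A quick back-of-the-envelope check on sizes — a minimal sketch using NumPy, where the cell and time-step counts come from the question and everything else is assumed:

```python
import numpy as np

n_cells, n_steps = 200_000, 4_000  # figures from the question

# Keeping the full time history of the fine grid in float64 is already ~6.4 GB,
# which would explain an 8 GB machine hitting swap:
history_bytes = n_cells * n_steps * np.dtype(np.float64).itemsize
print(history_bytes / 1e9, "GB")   # ~6.4 GB

# If each step only needs the previous step, two 1-D arrays are ~3.2 MB total:
current = np.zeros(n_cells, dtype=np.float64)
previous = np.zeros(n_cells, dtype=np.float64)
print((current.nbytes + previous.nbytes) / 1e6, "MB")   # ~3.2 MB

# Dropping to float32 halves that again, if the precision is acceptable:
current32 = current.astype(np.float32)
print(current32.nbytes / 1e6, "MB")   # ~0.8 MB
```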

Russia Must Remove Putin
  • I agree: we would need to see some code or at least some pseudocode. Are you using Python lists to hold your 2-dimensional (?) arrays, or some other data structure? Are you keeping multiple copies of the array around? How much memory are you **expecting** it to take? 200,000 cells × 4,000 time steps × 8 bytes ≈ 6.4 GB? Did you write the code, or did you get it from somewhere? Maybe it's a library? What do your colleagues use for this kind of problem? – Flortify May 13 '14 at 03:09
  • Can you try it on a smaller input? Does the memory consumption scale linearly with the data size? Does an input model a tenth of that size result in a tenth of the memory consumption (800 MB instead of 8 GB)? – Flortify May 13 '14 at 03:15
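One way to run the scaling test suggested in that last comment — a minimal sketch using the standard-library tracemalloc module, where run_model is a hypothetical stand-in for the actual model at a reduced problem size:

```python
import tracemalloc
import numpy as np

def run_model(n_cells):
    # Hypothetical stand-in for the real model at a reduced problem size.
    grid = np.random.rand(n_cells)
    return grid.sum()

for n_cells in (20_000, 50_000, 100_000, 200_000):
    tracemalloc.start()
    run_model(n_cells)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # If memory use is healthy, the peaks should grow roughly linearly with n_cells.
    print(f"{n_cells:>7} cells -> peak {peak / 1e6:.2f} MB")
```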

If the grid is sparsely populated, you might be better off tracking just the populated cells in a different data structure, rather than a giant Python list (array).
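For example, a minimal sketch of two sparse alternatives — a plain dict keyed by cell coordinates, and SciPy's sparse matrices — assuming a 2-D grid and that SciPy is available:

```python
import numpy as np
from scipy import sparse  # assumption: SciPy is installed

# Dense storage: every cell occupies memory, populated or not.
dense = np.zeros((2_000, 2_000))        # ~32 MB at float64
dense[5, 7] = 1.0
dense[100, 250] = 3.5

# Option 1: a plain dict keyed by (row, col) - only populated cells are stored.
populated = {(5, 7): 1.0, (100, 250): 3.5}
value = populated.get((42, 42), 0.0)    # missing cells default to 0.0

# Option 2: SciPy's dict-of-keys sparse matrix, convenient for incremental updates.
grid = sparse.dok_matrix((2_000, 2_000))
grid[5, 7] = 1.0
grid[100, 250] = 3.5

print(dense.nbytes)   # 32,000,000 bytes
print(grid.nnz)       # 2 stored entries
```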

Flortify