
I am storing several numpy arrays in a pytables file. Each individual array (roughly 1 MB to 100 MB) fits into RAM, but all N arrays together (N between about 10 and 1000) do not.

In the application I operate on these arrays repeatedly, which also changes their shapes. I therefore want to use pytables to swap arrays that are currently not needed to disk and reload them on demand (paging). Eviction should work on a "Least Recently Used" basis.
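To make the goal concrete, here is a minimal sketch of the behaviour I am after, written without pytables. `LRUDiskCache` and all of its method names are made up for illustration and are not part of any library. It assumes that keys are filename-safe strings, that objects can be pickled (numpy arrays can), and that an object's size is its `.nbytes` if it has one. The demo uses plain `bytes` objects as stand-ins for arrays so that it runs with the standard library alone.

```python
import os
import pickle
import sys
import tempfile
from collections import OrderedDict


class LRUDiskCache:
    """Hypothetical helper: keep recently used objects in RAM, spill the rest to disk."""

    def __init__(self, max_bytes, directory=None):
        self.max_bytes = max_bytes
        self.directory = directory or tempfile.mkdtemp(prefix="lru_swap_")
        self._in_ram = OrderedDict()  # key -> object, least recently used first
        self._sizes = {}              # key -> size in bytes, recorded on assignment
        self._on_disk = set()         # keys whose only valid copy is on disk

    @staticmethod
    def _size_of(obj):
        # numpy arrays expose .nbytes; otherwise fall back to sys.getsizeof
        return getattr(obj, "nbytes", None) or sys.getsizeof(obj)

    def _path(self, key):
        return os.path.join(self.directory, f"{key}.pkl")

    def __setitem__(self, key, obj):
        self._in_ram.pop(key, None)
        self._in_ram[key] = obj          # newest entry goes to the end
        self._sizes[key] = self._size_of(obj)
        self._on_disk.discard(key)       # any copy on disk is now stale
        self._evict()

    def __getitem__(self, key):
        if key in self._in_ram:
            self._in_ram.move_to_end(key)  # mark as most recently used
            return self._in_ram[key]
        if key not in self._on_disk:
            raise KeyError(key)
        with open(self._path(key), "rb") as fh:
            obj = pickle.load(fh)
        self[key] = obj                    # page back in, possibly evicting others
        return obj

    def _evict(self):
        # Write least recently used entries to disk until the budget is met.
        # The newest entry always stays in RAM, even if it alone exceeds the budget.
        while sum(self._sizes.values()) > self.max_bytes and len(self._in_ram) > 1:
            key, obj = self._in_ram.popitem(last=False)
            with open(self._path(key), "wb") as fh:
                pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)
            del self._sizes[key]
            self._on_disk.add(key)

    def resident_keys(self):
        return list(self._in_ram)


# Demo: a 2.5 MB budget holds two of the three ~1 MB stand-in objects.
cache = LRUDiskCache(max_bytes=2_500_000)
for name in ("a", "b", "c"):
    cache[name] = bytes(1_000_000)
print(cache.resident_keys())   # "a" was used least recently and is now on disk
_ = cache["a"]                 # reloads "a" and evicts "b" instead
print(cache.resident_keys())
```

With a real array one would reassign after an operation that changes its shape (`cache["a"] = new_array`) so that the recorded size stays accurate. What I am looking for is this kind of explicit memory budget (`max_bytes` here), but handled by pytables itself.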

How can I tell pytables how much RAM it can use?

I tried playing around with parameters.NODE_CACHE_SLOTS, but it had basically no effect. In a test script I store ~200 random arrays of shape (~1000, ~1000) in a table. No matter what values I chose for parameters such as NODE_CACHE_SLOTS, the RAM usage stayed the same at about 80 MB, even though several GB were available.

Especially when all nodes fit into RAM, a program using pytables would not need any disk I/O and would therefore be much faster. In general, one wants to make use of the available RAM.

[Of course, I would also be interested to hear of a better option than pytables for this kind of paging.]
