I want to cache some DataFrames in memory to speed up my program (calculate_df is slow). My code looks like this:
class Foo:
    cache = {}

    @classmethod
    def get_df(cls, bar):
        if bar not in cls.cache:
            cls.cache[bar] = cls.calculate_df(bar)
        return cls.cache[bar]

    @classmethod
    def calculate_df(cls, bar):
        ...  # slow computation that builds df
        return df
Almost all of the time, the number of possible values of bar times the size of each df fits comfortably into memory. However, I need to plan for the case where there are too many different bars, or the dfs are very large, so that the cache causes memory issues. I would like to check memory usage before running cls.cache[bar] = cls.calculate_df(bar).

What is the right/best way to do such a memory check?
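The best I could come up with is a guard like the sketch below. The os.sysconf call is Linux/POSIX-specific, and the 2 GiB threshold plus the stand-in calculate_df body are arbitrary choices I made for illustration; I don't know whether checking free memory this way is reliable:

```python
import os

def available_memory_bytes():
    # Free physical memory via sysconf (Linux/POSIX only);
    # psutil.virtual_memory().available would be a portable alternative.
    return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")

class Foo:
    cache = {}
    MIN_FREE_BYTES = 2 * 1024**3  # arbitrary threshold: keep at least 2 GiB free

    @classmethod
    def get_df(cls, bar):
        if bar not in cls.cache:
            df = cls.calculate_df(bar)
            # Only store the result if enough memory is still available;
            # otherwise compute and return it without caching.
            if available_memory_bytes() > cls.MIN_FREE_BYTES:
                cls.cache[bar] = df
            return df
        return cls.cache[bar]

    @classmethod
    def calculate_df(cls, bar):
        # stand-in for the real slow computation
        return f"df-for-{bar}"
```

My worry is that this only checks memory at insertion time, so the cache can still grow unbounded once entries are in it, and the right threshold seems hard to choose.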