We are doing background data processing with Django and Celery: we take a CSV file (up to 15 MB), convert it into a list of dicts (which also includes some Django model objects), and break it up into chunks to be processed in subtasks:
from celery import task
from django.core.cache import cache

FIVE_HOURS = 5 * 60 * 60

@task
def main_task(data):
    # chunk_up() is our helper that splits the parsed CSV data into chunks.
    for i, chunk in enumerate(chunk_up(data)):
        chunk_id = "chunk_id_{}".format(i)
        # Stash each chunk in the cache and fan out a subtask per chunk.
        cache.set(chunk_id, chunk, timeout=FIVE_HOURS)
        sub_task.delay(chunk_id)

@task
def sub_task(chunk_id):
    # Pull the chunk back out of the cache and process it.
    data_chunk = cache.get(chunk_id)
    ...  # do processing
All tasks run as concurrent processes in the background, managed by Celery. We originally used the Redis backend, but found it would routinely run out of memory during peak-load, high-concurrency scenarios, so we switched to Django's file-based cache backend. That fixed the memory issue, but we then saw that 20-30% of the cache entries never got written. No error is thrown; the writes just fail silently. When we go back and look up the cache from the CLI, we see that, for example, chunk_id_7 and chunk_id_9 exist but chunk_id_8 does not. So some cache entries are intermittently failing to get saved.
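For reference, the spot check from a Django shell looks roughly like this (the chunk IDs are just examples from one run):

# python manage.py shell
from django.core.cache import cache

cache.get("chunk_id_7")  # -> the chunk data, as expected
cache.get("chunk_id_9")  # -> the chunk data, as expected
cache.get("chunk_id_8")  # -> None: the entry was never written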
We swapped in the diskcache backend and are observing the same thing, though the failure rate seems to drop to 5-10% (a very rough estimate).
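For completeness, the swap was just a settings change, along these lines (the path here is a placeholder, not our real location):

# settings.py -- pointing Django's cache framework at diskcache
CACHES = {
    "default": {
        "BACKEND": "diskcache.DjangoCache",
        "LOCATION": "/var/tmp/django_cache",  # placeholder path
    }
}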
We noticed that there were concurrent-process issues with Django's file-based cache in the past, but they appear to have been fixed many years ago (we are on Django 1.11). One comment suggests this cache backend is more of a proof of concept, though again we're not sure whether that has changed since.
Is the file-based cache a production-quality caching solution? If so, what could be causing our write failures? If not, what would be a better solution for our use case?