I have several high-volume data streams coming in on different websockets (sensor data, several TBs per month), and
I want to guarantee that all of the data gets stored, even during high load.
So I want to dispatch the data to my database and to my real-time processing modules (e.g. GUI, ML predictions, etc.) in a way that
buffers the data streams in case the processing in those modules is too slow, so that they can 'catch up' once the load decreases.
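To make the setup concrete, here is roughly the shape I have in mind, sketched with plain asyncio. The websocket is faked with an async generator and the consumer names (db_writer, gui_update) and queue sizes are just placeholders, so this only illustrates the fan-out with bounded buffers, not my actual code:

```python
import asyncio
import random

async def sensor_stream(n=1000):
    """Stand-in for one websocket: yields n readings, one every ~1 ms."""
    for i in range(n):
        await asyncio.sleep(0.001)
        yield {"seq": i, "value": random.random()}

async def dispatcher(queues):
    """Fan every incoming message out to each consumer's own queue.
    await q.put() suspends while a queue is full, so a slow consumer
    back-pressures the reader instead of silently dropping data."""
    async for msg in sensor_stream():
        for q in queues:
            await q.put(msg)
    for q in queues:                      # tell the consumers we're done
        await q.put(None)

async def db_writer(q):
    """Slow consumer: pretends to persist each message."""
    while (msg := await q.get()) is not None:
        await asyncio.sleep(0.005)        # simulate a slow insert

async def gui_update(q):
    """Fast consumer: pretends to refresh a live view."""
    while (msg := await q.get()) is not None:
        pass

async def main():
    db_q = asyncio.Queue(maxsize=10_000)  # large buffer for the slow path
    gui_q = asyncio.Queue(maxsize=1_000)
    await asyncio.gather(
        dispatcher([db_q, gui_q]),
        db_writer(db_q),
        gui_update(gui_q),
    )

asyncio.run(main())
```

The catch here is exactly my problem: once db_q fills up, the dispatcher suspends on put and the GUI queue stalls along with it.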
What I tried: Python threads with queues (queue.Queue combined with the threading module), but if it's blocking I can't ensure the data doesn't get congested, and if it's non-blocking (e.g. asyncio.Queue) I get race conditions and things blow up.
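Stripped down, my threaded attempt looked roughly like this (again with placeholder consumers; the reader thread stands in for the websocket). Once the database queue fills up, the blocking put stalls the reader, and the GUI queue stalls along with it:

```python
import queue
import threading
import time

def reader(q_list, n=1000):
    """Stands in for the websocket thread: pushes n readings."""
    for i in range(n):
        for q in q_list:
            q.put(i)           # blocks once the queue hits maxsize
    for q in q_list:
        q.put(None)            # sentinel: no more data

def db_writer(q):
    """Slow consumer: pretends to persist each item."""
    while (item := q.get()) is not None:
        time.sleep(0.005)      # simulate a slow insert

def gui_update(q):
    """Fast consumer, but it still ends up waiting on the reader."""
    while (item := q.get()) is not None:
        pass

db_q, gui_q = queue.Queue(maxsize=100), queue.Queue(maxsize=100)
threads = [
    threading.Thread(target=reader, args=([db_q, gui_q],)),
    threading.Thread(target=db_writer, args=(db_q,)),
    threading.Thread(target=gui_update, args=(gui_q,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```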
So maybe I should use some kind of callback mechanism, but I don't know what to look for. I hope the question is not too vague. If anybody has a pointer to what I could try, ideally using Python only, that would really help me a lot, even if it's just an idea.