I am writing a server in C# that stores messages in a LiteDB database. When the server starts, it also launches an async task that deletes expired messages from the DB (messages older than a predefined threshold). Let's call this task the GC (garbage collector).
Every time the server receives a message, it stores the message in the DB, and sends a copy to the proper destination.
Since LiteDB is supposed to be thread-safe, I don't use locks to synchronize ordinary reads/writes to the DB. The only synchronization I do is between the GC and the regular reads/writes. For that, I use an async reader/writer lock: the GC takes the lock as a writer, and every other DB access (including message inserts) takes it as a reader.
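A minimal sketch of the design described above, for illustration only. It assumes LiteDB 5 and uses AsyncReaderWriterLock from the Nito.AsyncEx package as a stand-in for the async reader/writer lock (the actual lock implementation is not stated here). The Message and MessageStore types, the "messages" collection name, and all method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using LiteDB;          // assumed: LiteDB 5 NuGet package
using Nito.AsyncEx;    // assumed: stand-in async reader/writer lock

// Hypothetical message document; LiteDB maps Id to the document _id.
public class Message
{
    public int Id { get; set; }
    public string Body { get; set; }
    public DateTime ReceivedAt { get; set; }
}

// Hypothetical wrapper around the DB that applies the locking scheme.
public class MessageStore
{
    private readonly LiteDatabase _db;
    private readonly AsyncReaderWriterLock _gcLock = new AsyncReaderWriterLock();

    public MessageStore(LiteDatabase db)
    {
        _db = db;
    }

    // Regular DB access: takes the lock as a "reader", so many inserts
    // may run concurrently and rely on LiteDB's own thread safety.
    public async Task StoreAsync(Message message)
    {
        using (await _gcLock.ReaderLockAsync())
        {
            _db.GetCollection<Message>("messages").Insert(message);
        }
    }

    // GC pass: takes the lock as the "writer", so it waits for in-flight
    // regular accesses and excludes new ones while it deletes.
    // Returns the number of deleted messages.
    public async Task<int> CollectAsync(TimeSpan maxAge)
    {
        var cutoff = DateTime.UtcNow - maxAge;
        using (await _gcLock.WriterLockAsync())
        {
            return _db.GetCollection<Message>("messages")
                      .DeleteMany(m => m.ReceivedAt < cutoff);
        }
    }

    // Background loop started once at server startup.
    // Task.Delay throws an OperationCanceledException when the token is cancelled.
    public async Task RunGcAsync(TimeSpan interval, TimeSpan maxAge, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            await Task.Delay(interval, token);
            await CollectAsync(maxAge);
        }
    }
}
```

In this sketch the server would call RunGcAsync once at startup (without awaiting it) and StoreAsync for each incoming message. Note that the LiteDB calls themselves are synchronous, so each one blocks its thread while the reader or writer lock is held.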
Is this design correct, and will it work?
I've made a test (in a WPF app) that connects 50 clients, each in a separate task. Each client sends a message to every other client and expects to receive a message from every other client. I then verify that all messages were sent and received as expected. During this test, the GC runs every 3 seconds, and each run holds the writer lock for 1 second (using a delay). Sending each message also adds a 1-4 ms delay.
With 50 clients, the server passes the test fine. However, with 100 clients, I observe some weird behavior that makes me reconsider my design.
In between the first 2-3 acquisitions of the writer lock, I see many messages pass through properly. After those 2-3 acquisitions, however, I suddenly see only one message being sent per second, and the GC stops acquiring the writer lock. Eventually, after 1 minute of this, I get the following exception:
Exception thrown: 'LiteDB.LiteException' in LiteDB.dll
Database lock timeout when entering in transaction mode after 00:01:00
I assume this means that I have some deadlock somewhere. Perhaps too many messages are waiting to be inserted and hold the reader lock?