In a multi-threaded server application, I use the type Client to represent a client. The nature of Client is quite mutable: clients send UDP heartbeat messages to stay registered with the server, and a message may also contain some realtime data (think of a sensor). I need to keep track of many things, such as the timestamp and source address of the last heartbeat, the realtime data, etc. The result is a pretty big structure with a lot of state. Each client has a client ID, and I use a HashMap wrapped in an MVar to store the clients, so lookup is easy and fast.
type ID = ByteString
type ClientMap = MVar (HashMap ID Client)
There's a "global" value of ClientMap which is made available to each thread. It's stored in a ReaderT transformer along with many other global values.
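For illustration, the shared environment might look roughly like the sketch below. The names `Env`, `App`, and `clientCount` are hypothetical, `Client` is reduced to a placeholder, and `Data.Map.Strict` stands in for `HashMap` so that the example only depends on packages shipped with GHC.

```haskell
import Control.Concurrent.MVar (MVar, newMVar, readMVar)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M

type ID = B.ByteString

-- Placeholder for the real, much larger Client record.
newtype Client = Client { clientId :: ID }

-- Assumption: Map is used here as a stand-in for HashMap.
type ClientMap = MVar (M.Map ID Client)

-- Hypothetical environment; the real one holds many other global values.
data Env = Env { envClients :: ClientMap }

type App = ReaderT Env IO

-- Any thread running in App can reach the shared map through the reader.
clientCount :: App Int
clientCount = do
  mv <- asks envClients
  M.size <$> liftIO (readMVar mv)
```

Each thread would then be started with `runReaderT someAction env`, where `env` is built once at startup.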
The Client by itself is a big immutable structure, using strict fields to prevent space leaks:
data Client = Client
{
_c_id :: !ID
, _c_timestamp :: !POSIXTime
, _c_addr :: !SockAddr
, _c_load :: !Int
...
}
makeLenses ''Client
Using immutable data structures in a mutable wrapper is a common design pattern in Concurrent Haskell, according to Parallel and Concurrent Programming in Haskell. When a heartbeat message is received, the thread that processes the message constructs a new Client, locks the MVar of the HashMap, inserts the Client into the HashMap, and puts the new HashMap in the MVar. The code is basically:
modifyMVar_ hashmap_mvar (\hm ->
  let c = Client id ...
  in return $! M.insert id c hm)
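A more complete version of this update might look like the sketch below. It is only an approximation of the real code: `Client` is cut down to four fields, the address is a `String` stand-in for `SockAddr`, `Data.Map.Strict` replaces `HashMap` to avoid extra dependencies, and `recordHeartbeat` is a hypothetical name.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.Time.Clock.POSIX (POSIXTime, getPOSIXTime)

type ID = B.ByteString

-- Reduced Client; cAddr is a String stand-in for SockAddr.
data Client = Client
  { cId        :: !ID
  , cTimestamp :: !POSIXTime
  , cAddr      :: !String
  , cLoad      :: !Int
  }

-- Assumption: Map is used here as a stand-in for HashMap.
type ClientMap = MVar (M.Map ID Client)

-- Build a fresh Client and swap a new map into the MVar.
-- Every heartbeat from every client serialises on this one MVar.
recordHeartbeat :: ClientMap -> ID -> String -> Int -> IO ()
recordHeartbeat mv cid addr load = do
  now <- getPOSIXTime
  let c = Client cid now addr load
  modifyMVar_ mv (\hm -> return $! M.insert cid c hm)
```

Note that `modifyMVar_` is the variant that takes a function returning only the new contents, which matches the shape of the update here.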
This approach works fine, but as the number of clients grows (we now have tens of thousands of clients), several problems emerge:
- The client sends heartbeat messages pretty frequently (around every 30 seconds), resulting in access contention of the ClientMap.
- Memory consumption of the program seems to be quite high. My understanding is that updating large immutable structures wrapped in MVar frequently will make the garbage collector very busy.
Now, to reduce the contention of the global hashmap_mvar, I tried to wrap the mutable fields of Client in an MVar for each client, such as:
data ClientState = ClientState
{
_c_timestamp :: !POSIXTime
, _c_addr :: !SockAddr
, _c_load :: !Int
...
}
makeLenses ''ClientState
data Client = Client
{
c_id :: !ID
, c_state :: MVar ClientState
}
This seems to reduce the level of contention (because now I only need to update the MVar in each Client, so the granularity is finer), but the memory footprint of the program is still high. I've also tried to UNPACK some of the fields, but that didn't help.
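With this layout, a heartbeat update might look like the sketch below. It assumes a cut-down `ClientState` (with a `String` stand-in for `SockAddr`), uses `Data.Map.Strict` in place of `HashMap` to avoid extra dependencies, and `heartbeat` is a hypothetical name.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.Time.Clock.POSIX (POSIXTime, getPOSIXTime)

type ID = B.ByteString

data ClientState = ClientState
  { csTimestamp :: !POSIXTime
  , csAddr      :: !String   -- stand-in for SockAddr
  , csLoad      :: !Int
  }

data Client = Client
  { cId    :: !ID
  , cState :: !(MVar ClientState)
  }

-- Assumption: Map is used here as a stand-in for HashMap.
type ClientMap = MVar (M.Map ID Client)

heartbeat :: ClientMap -> ID -> String -> Int -> IO ()
heartbeat mv cid addr load = do
  now <- getPOSIXTime
  let st = ClientState now addr load
  hm <- readMVar mv   -- snapshot of the map; the global MVar is not held
  case M.lookup cid hm of
    -- Known client: only its own MVar is written.
    Just c  -> modifyMVar_ (cState c) (\_ -> return $! st)
    -- Unknown client: take the global MVar once to register it,
    -- re-checking in case another thread registered it meanwhile.
    Nothing -> modifyMVar_ mv $ \m ->
      case M.lookup cid m of
        Just c  -> modifyMVar_ (cState c) (\_ -> return $! st) >> return m
        Nothing -> do
          sv <- newMVar $! st
          return $! M.insert cid (Client cid sv) m
```

The map itself only changes when a client is added or removed, so a steady stream of heartbeats from known clients never replaces it. One caveat of this sketch: if a client is deleted between the lookup and the update, the heartbeat is written to an MVar that is no longer in the map.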
Any suggestions? Will STM solve the contention problem? Should I resort to mutable data structures rather than immutable ones wrapped in MVar?
See also Updating a Big State Fast in Haskell.
Edit:
As Nikita Volkov pointed out, a shared map smells like bad design in a typical TCP-based server-client application. However, in my case, the system is UDP-based, meaning there's no such thing as a "connection". The server uses a single thread to receive UDP messages from all the clients, parse them, and perform actions accordingly, e.g., updating the client data. Another thread reads the map periodically, checks the heartbeat timestamps, and deletes the clients that have not sent a heartbeat in the last 5 minutes, say. It seems like a shared map is inevitable? Anyway, I understand that using UDP was a poor design choice in the first place, but I would still like to know how I can improve my situation with UDP.
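The periodic cleanup described above can be sketched roughly as follows, using the original layout with a single MVar around the map. `sweepOnce` and `reaper` are hypothetical names, `Client` is reduced to the fields the sweep needs, `Data.Map.Strict` stands in for `HashMap`, and the 60-second sweep period is an arbitrary choice for the example.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import Control.Monad (forever)
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M
import Data.Time.Clock.POSIX (POSIXTime, getPOSIXTime)

type ID = B.ByteString

-- Reduced Client holding only what the sweep needs.
data Client = Client
  { cId        :: !ID
  , cTimestamp :: !POSIXTime
  }

-- Assumption: Map is used here as a stand-in for HashMap.
type ClientMap = MVar (M.Map ID Client)

-- Drop every client whose last heartbeat is older than maxAge seconds.
sweepOnce :: POSIXTime -> ClientMap -> IO ()
sweepOnce maxAge mv = do
  now <- getPOSIXTime
  modifyMVar_ mv $ \hm ->
    return $! M.filter (\c -> now - cTimestamp c <= maxAge) hm

-- Intended to run in its own thread, e.g. via forkIO.
-- 300 seconds is the 5-minute timeout; the 60-second period is arbitrary.
reaper :: ClientMap -> IO ()
reaper mv = forever $ do
  threadDelay (60 * 1000000)
  sweepOnce 300 mv
```

The sweep holds the map's MVar while filtering, so the receiving thread's updates wait until the sweep finishes.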