I recently started working on a server application built around the familiar master-worker pattern, where one privileged thread manages several worker threads. I have since realized how troublesome threads truly are.
I am now considering moving to processes instead, since they would solve many of the issues I'm experiencing.
However, performance is a major concern: I fear it will decline as memory usage rises, with duplicated data (lookup tables, context data, etc.) contending for space in the L2/L3 caches. This data needs to be occasionally modified and may grow quite large. Consider this pseudocode:
hash_table files;

function serve_file(connection, path)
    file = files[path]    // look up the open file by path
    sendfile(connection.fd, file.fd, 0, file.size)

function on_file_added_to_server_root(which)
    files.add(which, ...)
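
In real C the serving path would look roughly like this; lookup_file() and struct cached_file are hypothetical stand-ins for whatever hash table I end up using, and error handling is elided:

#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/types.h>

struct cached_file {
    int fd;        /* open file descriptor for the cached file */
    off_t size;    /* its size in bytes */
};

/* hypothetical stand-in for the hash lookup in the pseudocode */
static struct cached_file *lookup_file(const char *path)
{
    (void)path;
    return NULL;   /* the real version would return the cached entry */
}

void serve_file(int conn_fd, const char *path)
{
    struct cached_file *f = lookup_file(path);
    if (f == NULL)
        return;    /* 404 handling elided */

    off_t off = 0;
    /* sendfile(2) copies f->size bytes from f->fd to conn_fd in-kernel */
    sendfile(conn_fd, f->fd, &off, f->size);
}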
Given N worker processes, it would be a shame if there were N copies of this table. However, some tables would be perfect to have duplicated in every worker. On top of that, there is plenty of malloc(3)-allocated memory that could potentially be shared, but it may be scattered all over the heap, causing random pages to be duplicated by copy-on-write.
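
For the tables that should be shared, the one trick I know is to mmap(2) a shared anonymous region in the master before forking, so every worker references the same physical pages instead of COW copies. A minimal sketch, assuming a fixed upper bound on the table size (TABLE_BYTES and NWORKERS are made-up placeholders):

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define TABLE_BYTES (64 * 1024 * 1024)  /* made-up upper bound for the table */
#define NWORKERS    4                   /* made-up worker count */

int main(void)
{
    /* MAP_SHARED | MAP_ANONYMOUS: after fork(), all processes share the
     * same physical pages; writes are visible everywhere, no COW copies */
    void *table = mmap(NULL, TABLE_BYTES, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (table == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* master would build the table here, then fork the workers */

    for (int i = 0; i < NWORKERS; i++) {
        if (fork() == 0) {
            /* worker: reads (and, with proper locking, updates) the table
             * through the same mapping */
            _exit(0);
        }
    }
    return 0;
}

The catch, as far as I can tell, is that I then need a fixed upper bound plus my own allocator and locking inside that region, and it does nothing for the scattered malloc(3) data.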
Are there any tricks or general strategies to keep memory usage tight in multi-process designs?
Thanks!