I have integrated Apache Geode into a web application to store HTTP session data. The web application runs load-balanced, i.e. there are multiple instances of it sharing session data. Each web application instance has its own local Geode cache (locator and server), and the data is distributed to the other Geode nodes in the cluster through a replicated region. All instances are in the same network; there is no multi-site usage. There are around 5000 GET operations per second and approximately half as many PUT operations.
When testing this setup with only one web application instance, the performance is very promising (in the range of 20-30 ms). However, when adding another instance, there is a significant performance drop, with times of up to a few seconds.
Disabling TCP SYN cookies turned out to improve the processing time by up to 50%, but the performance is still not acceptable.
How could a potential bottleneck (e.g. in the communication between Geode nodes) be identified? I am mainly thinking of getting metrics/statistics out of Geode, but I have not found anything helpful in that regard yet. I'd appreciate any hints on how to investigate and eliminate performance problems with Apache Geode.
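
One candidate I may need to look at more closely is Geode's statistics sampling, which as far as I understand is switched on through configuration properties. The following is only a sketch under assumptions: the property keys are the ones I believe Geode documents (`statistic-sampling-enabled`, `statistic-archive-file`, `statistic-sample-rate`, `enable-time-statistics`), while the class name `GeodeStatsConfig`, the helper `statisticsProperties` and the archive file name are hypothetical. I have not verified that the resulting statistics actually reveal the bottleneck.

```java
import java.util.Properties;

public class GeodeStatsConfig {

    // Builds the properties that (as I understand it) enable Geode's
    // statistics sampling. The archive file name is a placeholder.
    static Properties statisticsProperties(String archiveFile) {
        Properties props = new Properties();
        // Turn periodic sampling of statistics on.
        props.setProperty("statistic-sampling-enabled", "true");
        // File the samples are written to.
        props.setProperty("statistic-archive-file", archiveFile);
        // Sampling interval in milliseconds.
        props.setProperty("statistic-sample-rate", "1000");
        // Also record timings; this presumably adds some overhead.
        props.setProperty("enable-time-statistics", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = statisticsProperties("session-node-stats.gfs");
        // Print the keys in sorted order so the output is stable.
        props.stringPropertyNames().stream().sorted()
             .forEach(k -> System.out.println(k + "=" + props.getProperty(k)));
        // Assumption: the properties would then be handed to Geode, e.g.
        // new org.apache.geode.cache.CacheFactory(props).create();
    }
}
```

If this is the right direction, each node would write its own archive file, and comparing the files of the single-instance and the two-instance run might show where the time is spent.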