I'm migrating a service-based integration platform from .NET Framework to .NET Core. The original versions of the platform have proven very successful, and compared to replacing it with an off-the-shelf integration solution, it has a far better ROI.
After redeveloping the code, all tests have been passing, and I have achieved higher levels of performance with a single IIS server than I could with 2 IIS servers running the original versions.
Except... if I go over ~3 messages/sec with multiple clients, I start seeing duplicate GUID key errors when trying to save instrumentation data to my DB. All of these errors are generated by the on-ramp service. The on-ramp places the message on a queue. The messages are then consumed by an off-ramp service and sent to the destination (for this load test the destination is a file folder).
Even though the off-ramp runs on the same server as the on-ramp, we do not see any duplication errors generated by the off-ramp. I suspect this is because the queue serialises the work into a linear process, so only one instance of the off-ramp is running at any time, whereas the on-ramp has up to 4 clients firing concurrent messages at its API.
Initially I thought the issue was caused by a static global variable class I had implemented, with its state crossing process boundaries. But I would then expect to see the issue on the off-ramp as well, as the service architecture for both is virtually identical.
Summary of thoughts on issue:
- If it were a pure coding issue (one that does not depend on concurrency), the errors would also happen at low messaging rates.
- If the GUID duplication were down to chance, the error would also be seen on the off-ramp.
- The on-ramp and off-ramp both run on the same server, but duplication is only seen on the on-ramp, i.e. the on-ramp is not impacting the off-ramp and vice versa.
- The duplication therefore has to be due to memory shared between concurrently running on-ramp instances, which only arises in the multiple-client scenario.
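To make the last point concrete, here is a minimal sketch of the kind of failure I have in mind. It is not the platform's actual code: `SharedContext`, `HandleUnsafe`, `HandleSafe` and `CountDuplicates` are hypothetical names, and the `Thread.Sleep` merely stands in for real work. It assumes the message ID is written to state that concurrent requests can all see and is read back later when the instrumentation row is saved.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for state shared by every request in the process.
public static class SharedContext
{
    public static Guid CurrentMessageId;
}

public static class Demo
{
    // Unsafe: the ID is parked in shared state and read back after some work,
    // so an overlapping request may have overwritten it in the meantime.
    public static Guid HandleUnsafe()
    {
        SharedContext.CurrentMessageId = Guid.NewGuid();
        Thread.Sleep(1); // simulated work before the instrumentation save
        return SharedContext.CurrentMessageId;
    }

    // Safe: the ID lives in a local variable that belongs to this call only.
    public static Guid HandleSafe()
    {
        Guid id = Guid.NewGuid();
        Thread.Sleep(1); // simulated work
        return id;
    }

    // Runs the handler from up to 4 concurrent callers (mirroring the 4 load-test
    // clients) and counts how many of the returned IDs were duplicates.
    public static int CountDuplicates(Func<Guid> handler, int requests)
    {
        var ids = new ConcurrentBag<Guid>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
        Parallel.For(0, requests, options, _ => ids.Add(handler()));
        return requests - ids.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine($"Safe duplicates:   {CountDuplicates(HandleSafe, 200)}");
        Console.WriteLine($"Unsafe duplicates: {CountDuplicates(HandleUnsafe, 200)}");
    }
}
```

The safe handler should never report duplicates. The unsafe one typically does once calls overlap, and the count varies from run to run. With a single caller at a time the unsafe version behaves correctly, which would fit both the low-rate behaviour and the clean off-ramp.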
To try and resolve the issue I removed the static global variable class but I'm still seeing the duplication errors.
This issue was never observed in the original IIS implementation (after millions of messages processed). I suspect the issue is with process isolation in the IIS-hosted Kestrel .NET Core service host. From what I have read, there is good isolation between different apps (based on IIS path) but not within the same app, i.e. within the same IIS app pool. This could explain why .NET Core does not support multiple apps running in the same IIS app pool.
If anyone has a good idea how I can achieve process isolation between instances of the same app running in the same IIS app pool, I would appreciate your thoughts/suggestions.