
I have an app deployed on a WildFly server on the Jelastic PaaS. The app functions normally with a few users. I'm trying to run some load tests using JMeter, in this case calling a REST API 300 times in 1 second.

This leads to around a 60% error rate on the requests, all of them 503 (Service Temporarily Unavailable). I don't know what I need to tweak in the environment to get rid of those errors. I'm pretty sure it's not my app's fault, since it is not heavy and I get the same results even when testing the load on the index page.

The topology of the environment is simply one WildFly node (with 20 cloudlets) and a PostgreSQL database with 20 cloudlets. I had fancier topologies, but in trying to narrow the problem down I cut out the load balancer (NGINX) and the multiple WildFly nodes.
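
For reference, the kind of burst I'm sending is roughly equivalent to this plain Java client (the URL below is just a placeholder for my actual endpoint); it simply fires 300 requests as close together as possible and counts the status codes, which is the same thing the JMeter summary report shows:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.atomic.AtomicInteger;

    public class BurstTest {
        public static void main(String[] args) {
            // Placeholder URL - swap in the real environment/endpoint.
            URI target = URI.create("https://my-env.example.com/api/resource");

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(target).GET().build();

            AtomicInteger ok = new AtomicInteger();
            AtomicInteger unavailable = new AtomicInteger();
            AtomicInteger other = new AtomicInteger();

            // Fire 300 requests as close together as the client allows.
            List<CompletableFuture<Void>> inFlight = new ArrayList<>();
            for (int i = 0; i < 300; i++) {
                inFlight.add(client.sendAsync(request, HttpResponse.BodyHandlers.discarding())
                        .thenAccept(resp -> {
                            if (resp.statusCode() == 200) ok.incrementAndGet();
                            else if (resp.statusCode() == 503) unavailable.incrementAndGet();
                            else other.incrementAndGet();
                        })
                        .exceptionally(ex -> { other.incrementAndGet(); return null; }));
            }
            CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();

            // Tally of responses: successes, 503s, and anything else (including connection errors).
            System.out.printf("200: %d, 503: %d, other: %d%n", ok.get(), unavailable.get(), other.get());
        }
    }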

yaexiste2

2 Answers


Requests via the shared load balancer (i.e. when your internet-facing node does not have a public IP) face strict QoS limits to protect platform stability. The whole point of the shared load balancer is that it's shared by many users, so you can't take 100% of its resources for yourself.

With a public IP, your traffic goes straight from the internet to your node and therefore those QoS limits are not needed or applicable.

As stated in the documentation, you need a public IP for production workloads (a load test should be considered 'production' in this context).

Damien - Layershift
  • I'll test this tomorrow. It seems like that is the reason. One more question: if using multiple WildFly nodes, does setting the NGINX IP to public solve this issue in the same way? – yaexiste2 Jul 01 '21 at 07:15
  • Yes, exactly. Only the internet-facing node needs a public IP - so if you have an LB, the LB gets the public IP and the WildFly nodes do not need one. – Damien - Layershift Jul 01 '21 at 07:16

I don't know what I need to tweak in the environment to get rid of those errors

We don't know either, and since your question doesn't provide a sufficient level of detail, we can only come up with generic suggestions such as:

  1. Check the WildFly log for any suspicious entries. HTTP 503 is a server-side error, so it should be logged along with a stack trace which will lead you to the root cause
  2. Check whether the WildFly instance(s) have enough headroom to operate in terms of CPU, RAM, etc.; this can be done using e.g. the JMeter PerfMon Plugin
  3. Check JVM- and WildFly-specific JMX metrics using JVisualVM or the aforementioned JMeter PerfMon Plugin
  4. Double-check the Undertow subsystem configuration for any connection/request/rate-limiting entries (see the CLI sketch after this list)
  5. Use a profiler tool like JProfiler or YourKit to see which functions are slowest, which objects are largest, etc.
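
For point 4, assuming a stock standalone WildFly profile, the relevant Undertow and IO settings can be inspected from the management CLI; the resource addresses below are the defaults and may differ in a customized configuration:

    # connect to the WildFly management interface
    ./bin/jboss-cli.sh --connect

    # HTTP listener settings (e.g. max-connections) on the default server
    /subsystem=undertow/server=default-server/http-listener=default:read-resource(include-defaults=true)

    # any filters (such as request-limit) defined under Undertow
    /subsystem=undertow/configuration=filter:read-resource(recursive=true)

    # worker thread pool backing the listeners
    /subsystem=io/worker=default:read-resource(include-defaults=true)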
Dmitri T
  • Thanks for the answer. There are no entries in the WildFly log, there are plenty of resources, and it's not a problem with any of my Java objects, since the tests give the same results whether I call expensive operations or just load the index. It could be something related to Undertow, though I'm not sure, since multiple WildFly instances did not improve this issue. I'll check those other things just in case as well. – yaexiste2 Jul 01 '21 at 07:10