We used BlazeMeter to evaluate how well our site handles load.
I set up a simple script that logs in (via a special page that ensures each load-test user gets a different account), visits several pages along a common route, and ends by generating and downloading a PDF report. These reports are generated on the fly, and the download can take a little while to start.
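For reference, the journey each simulated user walks looks roughly like this (sketched in Python with the requests library; the URLs and field names are placeholders, not our real endpoints, and the actual test runs as a BlazeMeter script):

```python
import requests

BASE = "https://example.com"  # placeholder host

def run_user(user_id: int) -> None:
    s = requests.Session()

    # Special login page that hands each load-test user a distinct account.
    s.post(f"{BASE}/loadtest-login", data={"user": f"loadtest-{user_id}"})

    # Walk a common route through the site.
    for path in ("/dashboard", "/orders", "/orders/recent", "/summary"):
        s.get(f"{BASE}{path}")

    # Finish by generating and downloading the PDF report; generation is
    # on the fly, so the first bytes can take a while to arrive.
    r = s.get(f"{BASE}/reports/generate.pdf", stream=True, timeout=120)
    for _ in r.iter_content(chunk_size=8192):
        pass  # drain the download to completion
```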
The results of a load test with 1,000 concurrent users (reached at 10:13 in the graph below) are as follows:
As we expected, response time increased as the number of users ramped up, with a corresponding increase in latency. After 10:13, there were consistently 1,000 users working through the script.
What confuses us is the spike in latency (and correspondingly, response time) around 10:25.
We have run this test multiple times, and the graphs always end up looking similar to this: after a few minutes at 1,000 concurrent users there is a period of higher latency and response time, after which the latency drops like a stone and response time stabilizes.
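In case it's useful, here is roughly how we could time each step of the journey separately to see whether the spike hits every request or just the PDF generation (again a Python/requests sketch with placeholder paths):

```python
import time
import requests

BASE = "https://example.com"  # placeholder host

def timed_get(session: requests.Session, path: str) -> float:
    """Wall-clock seconds for one request, with the body fully downloaded."""
    start = time.monotonic()
    r = session.get(f"{BASE}{path}", stream=True)
    r.content  # drain the full body so download time is included
    return time.monotonic() - start

s = requests.Session()
for path in ("/dashboard", "/orders", "/reports/generate.pdf"):
    print(f"{path}: {timed_get(s, path):.3f}s")
```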
We've discussed this with our hosting service (which typically caps our bandwidth at 4MB but will 'burst' us to 100MB during periods of high usage), and they are unable to explain it. Our initial thought was that after a few minutes at higher load, the host was automatically adjusting their network to give us higher priority or faster throughput, which caused a few minutes of disruption followed by improved performance.
Our host, however, claims this is not the case. They say we always have the 100MB speed, and that exceeding a certain threshold is simply a 'billing event', not something that requires their systems to do anything.
What can cause performance like this?