
I am experimenting with OpenFaaS and trying to evaluate the performance benefit of having more than one worker serve the same function (the default shasum function available from the store). I have a cluster of three 'small' (1 vCPU, 2 GB RAM), one 'medium' (1 vCPU, 4 GB RAM) and one 'big' (2 vCPU, 4 GB RAM) VMs. Scheduling is done with Kubernetes, and the medium and big VMs are excluded from hosting functions, so all function pods run on the small VMs. The hey tool is used to perform multiple invocations, and I spawn workers (i.e., additional pods, instances of the function) manually through the API. Port 8080 of the gateway component is port-forwarded to localhost (kubectl port-forward -n openfaas svc/gateway 8080:8080 &), and invocations of the function are made using a command line similar to the following:

hey -n 50 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum

or

hey -n 20000 -c 600 -m POST -d test http://localhost:8080/function/shasum

(the first one computes 50 shasums of a 30 MB file with 3 concurrent clients; the second one sends a small request 20,000 times from 600 concurrent publishers). The invocations are made from the 'big' VM, which cannot host any function pods (it is cordoned).
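
For completeness, this is roughly how the pods are placed and scaled; the node names below are placeholders for my actual VM names, and /system/scale-function is the standard OpenFaaS gateway API:

# keep function pods off the medium and big VMs (node names are placeholders)
kubectl cordon medium-vm-1
kubectl cordon big-vm-1

# spawn additional workers (function pods) through the gateway API
curl -X POST http://localhost:8080/system/scale-function/shasum -d '{"serviceName": "shasum", "replicas": 3}'

# verify the pods landed on the three small VMs
kubectl get pods -n openfaas-fn -o wide -l faas_function=shasum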

Sometimes I notice that if I call the function with a large number of concurrent requests or large file inputs, the gateway fails to forward the requests and port-forwarding breaks (for example, when substituting -c 3 with -c 5 in the first command, for 5 concurrent producers).
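
One way to rule out the port-forward itself as the bottleneck (kubectl port-forward tunnels every byte through a single kubectl process) would be to hit the gateway's NodePort directly; assuming the default OpenFaaS Helm installation, a gateway-external service is exposed on NodePort 31112:

hey -n 50 -c 5 -m POST -D 50large.txt http://<small-vm-ip>:31112/function/shasum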

But even when port-forwarding is not broken (i.e., using -c 3), I get results that are hard to explain. Consider the execution log below for a run which makes continuous use of three workers (function pods), evenly spread over the three small VMs:

root@big-vm-1:~# hey -n 500 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum

Summary:
  Total:        541.0489 secs
  Slowest:      5.5438 secs
  Fastest:      1.1259 secs
  Average:      3.2351 secs
  Requests/sec: 0.9204

And the other execution log, which only uses a single worker (one function pod):

root@big-vm-1:~# hey -n 500 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum

Summary:
  Total:        551.3123 secs
  Slowest:      5.1512 secs
  Fastest:      1.4815 secs
  Average:      3.3106 secs
  Requests/sec: 0.9033

Why does using multiple function pods achieve only marginally better results? Can anyone suggest an approach to verify that using multiple workers is actually better than using a single worker, with this or a related setup?
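
For what it's worth, one way to at least confirm that all three pods receive traffic during a run is to compare the per-pod logs (the faas_function label is set by OpenFaaS on function pods; --prefix requires kubectl 1.17 or newer):

kubectl logs -n openfaas-fn -l faas_function=shasum --prefix --tail=20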

atsag

2 Answers


Not knowing what you mean by "worker", it's very hard to guess why a different number of workers doesn't have more impact.

The only mention of "worker" I was able to find in the OpenFaaS documentation is:

The queue-worker acts as a subscriber and deserializes the HTTP request and uses it to invoke the function directly

so if this is your "worker", then increasing the number of subscribers shouldn't increase the processing speed, and your results are kind of expected.
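
Note that the queue-worker only handles asynchronous invocations; synchronous POSTs to /function/shasum never pass through it. Assuming the standard faas-netes installation, where the component runs as the queue-worker deployment in the openfaas namespace, you can confirm this by watching its logs stay silent during a synchronous test:

kubectl logs -n openfaas deploy/queue-worker --follow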

I noticed you're using localhost. If you have a local Kubernetes installation and are running your tests on a single physical (or virtual) machine, be aware that it's not the best idea to have the load generator (hey in your case) and the system under test on the same machine, due to resource contention between the two (which will happen for sure).

It is also a good idea to run performance tests against a production-like environment (staging), because you cannot extrapolate the results to predict or calculate the saturation/breaking points for different hardware and software. Some aspects can be tested on a scaled-down environment, but in general the results won't be reliable, so consider conducting a test under more realistic conditions with a realistic workload, payload, concurrency, etc.

Dmitri T
  • Thank you for taking the time to answer, Dmitri! Workers are function pods. Since the function calls are independent, a great reduction could be expected. Localhost is used only to forward the load to the function pods (kubectl port-forward), which are exclusively scheduled outside the 'big' VM (so less contention between the load generator and the functions). In fact, it is a realistic staging environment; the VMs are public cloud instances. Load allocation, soak testing and application monitoring are some aspects from your last reference which can be tested using this setup... – atsag Jan 04 '22 at 09:59

Having a Kubernetes cluster similar to yours and working on something quite related, I wanted to accelerate functions' execution in a parallel fashion. This can be done with asynchronous function invocation and by scaling up the queue-worker OpenFaaS component, which serves the async requests. Scaling up the function's replicas didn't seem to help at all when a function is short-lived.
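
As a rough sketch (the replica count here is arbitrary; /async-function/ is the standard OpenFaaS route for async invocation, which returns 202 immediately and lets the queue-workers drain the queue in parallel):

# invoke asynchronously instead of synchronously
hey -n 20000 -c 600 -m POST -d test http://localhost:8080/async-function/shasum

# scale up the queue-worker component that consumes the queue
kubectl scale deployment queue-worker -n openfaas --replicas=4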

This GitHub issue helped me a lot.