I am experimenting with OpenFaaS, trying to evaluate the performance benefit of having more than one worker serve the same function (the default shasum function available from the store). My cluster consists of three 'small' (1 vCPU, 2 GB RAM), one 'medium' (1 vCPU, 4 GB RAM), and one 'big' (2 vCPU, 4 GB RAM) VMs. Scheduling is done by Kubernetes; the medium and big VMs are excluded from hosting function pods, so all functions run on the small VMs. The hey tool is used to perform multiple invocations, and I spawn workers (i.e., additional pods, instances of the function) manually through the API (see the scaling sketch below). Port 8080 of the gateway component is port-forwarded to localhost (kubectl port-forward -n openfaas svc/gateway 8080:8080 &), and the function is invoked with a command line similar to one of the following:
hey -n 50 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum
or
hey -n 20000 -c 600 -m POST -d test http://localhost:8080/function/shasum
(The first command makes 50 POSTs of a ~30 MB file with 3 concurrent clients; the second makes 20000 requests with a small payload from 600 concurrent clients.) The invocations are made from the 'big' VM, which is cordoned and therefore cannot host any function pods.
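For completeness, this is roughly how I scale the function. The gateway exposes a scale endpoint; the password and the replica count below are placeholders from my setup:

# Scale the shasum function to 3 replicas via the gateway's scale endpoint
# ($PASSWORD holds the gateway admin password; 3 is just an example count)
curl -s -u admin:$PASSWORD \
  -X POST http://localhost:8080/system/scale-function/shasum \
  -d '{"serviceName": "shasum", "replicas": 3}'

# Equivalent, going through Kubernetes directly (functions live in openfaas-fn)
kubectl scale deployment/shasum -n openfaas-fn --replicas=3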
Sometimes, when I call the function with a large number of concurrent requests or with large file payloads, the gateway fails to forward the requests and the port-forward breaks (for example, the first command fails when -c 3 is replaced with -c 5, i.e., 5 concurrent clients).
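To confirm where the function pods land, I check the NODE column of kubectl's wide output (openfaas-fn is the default function namespace):

# Shows which node each function pod is running on
kubectl get pods -n openfaas-fn -o wide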
But even when the port-forward survives (i.e., with -c 3), I get results that are hard to explain. Consider the execution log below for a run that keeps three workers (function pods) busy, evenly spread over the three small VMs:
root@big-vm-1:~# hey -n 500 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum
Summary:
  Total:        541.0489 secs
  Slowest:      5.5438 secs
  Fastest:      1.1259 secs
  Average:      3.2351 secs
  Requests/sec: 0.9204
And here is the execution log for a run that uses only a single worker (one function pod):
root@big-vm-1:~# hey -n 500 -c 3 -m POST -D 50large.txt http://localhost:8080/function/shasum
Summary:
  Total:        551.3123 secs
  Slowest:      5.1512 secs
  Fastest:      1.4815 secs
  Average:      3.3106 secs
  Requests/sec: 0.9033
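For what it's worth, during the runs I also keep an eye on per-node and per-pod utilization (this assumes metrics-server is installed in the cluster), but that alone does not tell me whether the extra pods translate into real throughput gains:

# Refresh node CPU/memory usage every few seconds during a hey run
watch -n 5 kubectl top nodes

# Per-pod usage for the function pods
kubectl top pods -n openfaas-fn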
Why does using multiple function pods achieve only marginally better results? Can anyone suggest an approach for verifying that multiple workers actually perform better than a single worker, using this or a similar setup?