Node.js is a single-threaded framework (async/await just means deferred execution, not parallel execution). Caliper runs a loop with the following step:
- Waiting for the rate controller to enable the next TX
- Creates an async operation in which the user module will call the blockchain adapter.
All of the pending TXs eat up some CPU time (when not waiting for I/O), plus other operations are also scheduled (like sending updates about TXs to the master process).
To reach 500 TPS, the rate controller must enable a TX every 2ms. That's not a lot of time. Try spawning more than 1 local clients, so the load will be shared among them (100 TPS/client for 5 clients, 50 TPS/client for 10 clients, etc).