We are using a "small" memory-only Aerospike server to store website analytics for the last hour. The data for the last hour is about 10 GB.
We tried to run some aggregation queries against Aerospike from a separate server (Java-based client), something like this (in Lua):
stream : aggregate( map(), complex_aggregate_function ) : reduce( simple_reduce_function )
According to the documentation, all aggregation is done on the Aerospike nodes (a single node in our case), and the reduce is done on the client.
It turns out that the aggregate() function processes only a small batch of data at a time, i.e. 10-16 records. After that, the aggregation result is sent to the client to be processed by reduce().
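To make the observed behaviour concrete, here is a rough Python model of it. This is not Aerospike code: the batch size of 16, the per-page hit count used as the aggregate, and the names `server_side_partials` and `client_side_reduce` are all assumptions for illustration only.

```python
from itertools import islice

BATCH_SIZE = 16  # assumed: upper end of the 10-16 records we saw per aggregate() run


def server_side_partials(records, batch_size=BATCH_SIZE):
    """Yield one partial aggregate (a per-page hit count) per batch of records."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        partial = {}
        for page in batch:
            partial[page] = partial.get(page, 0) + 1
        yield partial  # in our setup, each of these is shipped to the client


def client_side_reduce(partials):
    """Merge partial aggregates; return (merged result, number of partials received)."""
    total, received = {}, 0
    for partial in partials:
        received += 1
        for page, hits in partial.items():
            total[page] = total.get(page, 0) + hits
    return total, received


records = ["/home", "/about", "/home", "/pricing"] * 250  # 1000 fake page views
result, shipped = client_side_reduce(server_side_partials(records))
print(shipped)  # number of partial results the client had to merge
```

In this model the client merges one partial result per batch, so the number of results crossing the network grows linearly with the number of records instead of staying at one per node.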
Since the reduce() operation is executed on the client, the server has to send at least 1/16 of the data to the client (assuming each partial result is about the size of one record). That is hundreds of megabytes for our data. So much for performance.
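A back-of-the-envelope version of that estimate, as a small Python sketch. It rests on two assumptions that are ours, not Aerospike's: one partial result is sent per batch, and each partial result is roughly as large as a single record. The function name `estimated_transfer_mib` is made up for this example.

```python
DATA_SIZE_MIB = 10 * 1024  # about 10 GB of records for the last hour


def estimated_transfer_mib(data_size_mib, batch_size):
    """Estimate data sent to the client, assuming one record-sized partial per batch."""
    return data_size_mib / batch_size


# Batches of 10-16 records were observed, so check both ends of the range.
for batch_size in (10, 16):
    mib = estimated_transfer_mib(DATA_SIZE_MIB, batch_size)
    print(f"batch of {batch_size}: about {mib:.0f} MiB sent to the client")
```

Under these assumptions the transfer is somewhere between 640 MiB and 1 GiB per query, which matches the "hundreds of megabytes" figure.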
Is it possible to change the "buffer size", "queue size" or whatever setting controls how many records are aggregated per batch in a record stream? In other words, can Aerospike be tuned so that reduce() is called only once per node?