Ok, so the problem is that I need to do aggregation queries on aerospike's aql console. Specifically, I would like to take an average of a bins of records in a set and to count all the records in a set. I am not sure how to even begin...
2 Answers
aql> SHOW SETS
will give you the numbers of objects in yours sets, with the column n_objects
Then, you use the n_objects
values to calculate your average

- 989
- 2
- 15
- 26
SQL-like aggregation functions are implemented in Aerospike using stream UDFs, which are written in Lua. A stream UDF is a map-reduce operation that is applied on a stream of records returned from a scan or secondary index query.
The stream UDF module (let's assume it's contained in the file aggr_funcs.lua
) would implement COUNT(*) by returning 1
for each record it sees, and reducing to an aggregated integer value.
local function one(record)
return 1
end
local function sum(v1, v2)
return v1 + v2
end
function count_star(stream)
return stream : map(one) : reduce(sum)
end
You would register the UDF module with the server, then invoke it. Here's an example of how you'd do that in Python using aerospike.Query.apply
:
import aerospike
from aerospike import predicates as p
config = {'hosts': [('127.0.0.1', 3000)],
'lua': {'system_path':'/usr/local/aerospike/lua/',
'user_path':'/usr/local/aerospike/usr-lua/'}}
client = aerospike.client(config).connect()
query = client.query('test', 'demo')
#query.where(p.between('my_val', 1, 9)) optionally use a WHERE predicate
query.apply('aggr_funcs', 'count_star')
num_records = query.results()
client.close()
However, you should get metrics such as the number of objects using an info command. Aerospike has an info subsystem that is used by the command line tools such as asinfo, the AMC dashboard, and the info methods of the language clients.
To get the number of objects in the cluster:
asinfo -h 33.33.33.91 -v 'objects'
23773
You can also get the number of objects in a specific namespace. I have a two node cluster, and I'll query each one:
asinfo -h 33.33.33.91 -v 'namespace/test'
type=device;objects=23773;sub-objects=0;master-objects=12274;master-sub-objects=0;prole-objects=11499;prole-sub-objects=0;expired-objects=0;evicted-objects=0;set-deleted-objects=0;nsup-cycle-duration=0;nsup-cycle-sleep-pct=0;used-bytes-memory=2139672;data-used-bytes-memory=618200;index-used-bytes-memory=1521472;sindex-used-bytes-memory=0;free-pct-memory=99;max-void-time=202176396;non-expirable-objects=0;current-time=201744558;stop-writes=false;hwm-breached=false;available-bin-names=32765;used-bytes-disk=6085888;free-pct-disk=99;available_pct=99;memory-size=2147483648;high-water-disk-pct=50;high-water-memory-pct=60;evict-tenths-pct=5;evict-hist-buckets=10000;stop-writes-pct=90;cold-start-evict-ttl=4294967295;repl-factor=2;default-ttl=432000;max-ttl=0;conflict-resolution-policy=generation;single-bin=false;ldt-enabled=false;ldt-page-size=8192;enable-xdr=false;sets-enable-xdr=true;ns-forward-xdr-writes=false;allow-nonxdr-writes=true;allow-xdr-writes=true;disallow-null-setname=false;total-bytes-memory=2147483648;read-consistency-level-override=off;write-commit-level-override=off;migrate-order=5;migrate-sleep=1;migrate-tx-partitions-initial=4096;migrate-tx-partitions-remaining=0;migrate-rx-partitions-initial=4096;migrate-rx-partitions-remaining=0;migrate-tx-partitions-imbalance=0;total-bytes-disk=8589934592;defrag-lwm-pct=50;defrag-queue-min=0;defrag-sleep=1000;defrag-startup-minimum=10;flush-max-ms=1000;fsync-max-sec=0;max-write-cache=67108864;min-avail-pct=5;post-write-queue=0;data-in-memory=true;file=/opt/aerospike/data/test.dat;filesize=8589934592;writethreads=1;writecache=67108864;obj-size-hist-max=100
asinfo -h 33.33.33.92 -v 'namespace/test'
type=device;objects=23773;sub-objects=0;master-objects=11499;master-sub-objects=0;prole-objects=12274;prole-sub-objects=0;expired-objects=0;evicted-objects=0;set-deleted-objects=0;nsup-cycle-duration=0;nsup-cycle-sleep-pct=0;used-bytes-memory=2139672;data-used-bytes-memory=618200;index-used-bytes-memory=1521472;sindex-used-bytes-memory=0;free-pct-memory=99;max-void-time=202176396;non-expirable-objects=0;current-time=201744578;stop-writes=false;hwm-breached=false;available-bin-names=32765;used-bytes-disk=6085888;free-pct-disk=99;available_pct=99;memory-size=2147483648;high-water-disk-pct=50;high-water-memory-pct=60;evict-tenths-pct=5;evict-hist-buckets=10000;stop-writes-pct=90;cold-start-evict-ttl=4294967295;repl-factor=2;default-ttl=432000;max-ttl=0;conflict-resolution-policy=generation;single-bin=false;ldt-enabled=false;ldt-page-size=8192;enable-xdr=false;sets-enable-xdr=true;ns-forward-xdr-writes=false;allow-nonxdr-writes=true;allow-xdr-writes=true;disallow-null-setname=false;total-bytes-memory=2147483648;read-consistency-level-override=off;write-commit-level-override=off;migrate-order=5;migrate-sleep=1;migrate-tx-partitions-initial=4096;migrate-tx-partitions-remaining=0;migrate-rx-partitions-initial=4096;migrate-rx-partitions-remaining=0;migrate-tx-partitions-imbalance=0;total-bytes-disk=8589934592;defrag-lwm-pct=50;defrag-queue-min=0;defrag-sleep=1000;defrag-startup-minimum=10;flush-max-ms=1000;fsync-max-sec=0;max-write-cache=67108864;min-avail-pct=5;post-write-queue=0;data-in-memory=true;file=/opt/aerospike/data/test.dat;filesize=8589934592;writethreads=1;writecache=67108864;obj-size-hist-max=100
Notice that the value of master-objects
on each of the nodes adds up together to the cluster-wide objects
value.
To get the number of objects in a set:
asinfo -h 33.33.33.91 -v 'sets/test/demo'
n_objects=23771:n-bytes-memory=618046:stop-writes-count=0:set-enable-xdr=use-default:disable-eviction=false:set-delete=false;

- 6,951
- 22
- 41
-
Good answer. Thanks. Aerospike's docs can be rather opaque, and you description is clear. – Eric M Dec 11 '20 at 21:49
-
Since I last answered that there’s been work on a Presto connector for Aerospike. Both Presto and Spark with the connectors can do very expressive SQL with predicates pushed down to Aerospike so the compute is split between them – Ronen Botzer Dec 12 '20 at 04:10