How does the `count` aggregation function in Apache AGE work?

Question

This is the data setup from the documentation:

SELECT * FROM cypher('graph_name', $$
CREATE (:L {a: 1, b: 2, c: 3}),
       (:L {a: 2, b: 3, c: 1}),
       (:L {a: 3, b: 1, c: 2})
$$) as (a agtype);

and this is the query:

SELECT * FROM cypher('graph_name', $$
    MATCH (x:L)
    RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
$$) as (count agtype, key agtype);

Output:

 count | key
-------+-----
 12    | 6
(1 row)

I don't understand exactly how the count function works. Where is the grouping key in this example: is it the (x.a + x.b + x.c) part or the , x.a + x.b + x.c part? And how does count then yield the above output?

score 0 · Answer 1 · answered May 05 '23 at 12:36

You can better understand how this query works if you change some values. For example, suppose you create your initial dataset like this:

SELECT * FROM cypher('graph_name', $$
CREATE (:L {a: 1, b: 2, c: 3}),
       (:L {a: 2, b: 3, c: 1}),
       (:L {a: 1, b: 1, c: 1})
$$) as (a agtype);

and then run the same query:

SELECT * FROM cypher('graph_name', $$
    MATCH (x:L)
    RETURN (x.a + x.b + x.c) + count(*) + count(*), x.a + x.b + x.c
$$) as (count agtype, key agtype);

you will get this output:

 count | key 
-------+-----
 5     | 3
 10    | 6

So basically, the query uses x.a + x.b + x.c as the grouping key. The first group contains the rows where x.a + x.b + x.c equals 3. Looking at our dataset, only one vertex has that sum. Since the group holds only 1 vertex, each count(*) returns 1, and because count(*) is used twice, the two calls contribute 2 in total. Therefore the count in the output is 3 + 1 + 1 = 5, and the key is just x.a + x.b + x.c, which is 3.

The second group contains the rows where x.a + x.b + x.c equals 6. Two rows satisfy that grouping key, so count(*) equals 2 (and the two count(*) calls together contribute 4). Therefore (x.a + x.b + x.c) + count(*) + count(*) gives 6 + 2 + 2 = 10, which is the count in our output. The key is computed the same way as for the first group: x.a + x.b + x.c, which is 6.
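
The grouping described above can be mimicked outside the database. The following is only a sketch in plain Python, not Apache AGE code: the helper name grouped_result is made up for illustration, and it assumes that the non-aggregate expression in RETURN acts as the grouping key, as this answer describes.

```python
from collections import Counter

# Property maps of the three vertices from the modified dataset above.
vertices = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": 3, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]

def grouped_result(rows):
    """Return (count_column, key) pairs, one per distinct grouping key.

    Hypothetical helper that imitates the query's grouping by hand.
    """
    # The non-aggregate expression x.a + x.b + x.c is the grouping key.
    keys = [r["a"] + r["b"] + r["c"] for r in rows]
    # count(*) per group: how many rows share each key.
    group_sizes = Counter(keys)
    # First column: key + count(*) + count(*), evaluated once per group.
    return sorted((key + n + n, key) for key, n in group_sizes.items())

result = grouped_result(vertices)
print(result)  # [(5, 3), (10, 6)]
```

Passing the original dataset from the question instead puts all three vertices in the group with key 6, so the helper returns the single row (12, 6).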

score 0 · Answer 2 · edited May 08 '23 at 21:45

Refer to the official Apache AGE documentation: the count() function is explained under the aggregation functions heading.

https://age.apache.org/age-manual/master/functions/aggregate_functions.html

edited May 08 '23 at 21:45 by Moritz Ringler
answered May 07 '23 at 10:39 by urooj fatima

score 0 · Answer 3 · answered Jun 30 '23 at 12:42

Here is a breakdown of how the query works:

The MATCH clause matches all nodes with the label L.
The RETURN clause returns two expressions: (x.a + x.b + x.c) + count(*) + count(*) and x.a + x.b + x.c.
The count(*) expression counts the number of matched rows in each group.
The x.a + x.b + x.c expression is the grouping key.
The result is a table with two columns: count and key.

score 0 · Answer 4 · answered Jul 31 '23 at 20:29

So basically, this is a Cypher query that is executed on a graph database from PostgreSQL using the cypher function. The query calculates the sum of the properties a, b, and c for each node and counts the nodes with the label L. The first output column is that sum (6) plus the node count taken twice (3 + 3), which is 12, and the second column is the sum of properties for each node, which is 6.

I hope this helps!

score 0 · Answer 5 · answered Aug 18 '23 at 19:57

Actually, the count function counts the rows. In this case, it counts all the rows because every node has the same value of x.a + x.b + x.c, so they all fall into a single group.

In the result, it adds the sum of the properties (x.a + x.b + x.c), i.e. 6, to the total row count twice (3 + 3), which gives 12:

(x.a + x.b + x.c) + count(*) + count(*)
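
To check this arithmetic without an AGE installation, the same expression can be run as ordinary SQL. This is a rough sketch that uses Python's built-in sqlite3 module as a stand-in: the table l and the column aliases are made up for illustration, and it assumes that the implicit Cypher grouping corresponds to an explicit GROUP BY on the non-aggregate expression.

```python
import sqlite3

# In-memory SQLite table standing in for the three :L vertices
# of the question's original dataset (this is not Apache AGE itself).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE l (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO l VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 3, 1), (3, 1, 2)],
)

# The non-aggregate expression a + b + c is spelled out as an explicit
# GROUP BY here; in the Cypher query it is implied by the RETURN clause.
rows = conn.execute(
    """
    SELECT (a + b + c) + count(*) + count(*) AS count_col,
           a + b + c AS sum_key
    FROM l
    GROUP BY a + b + c
    """
).fetchall()
conn.close()

print(rows)  # [(12, 6)]
```

All three rows sum to 6, so they land in one group of size 3 and the first column evaluates to 6 + 3 + 3 = 12.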

score 0 · Answer 6 · answered Aug 21 '23 at 11:10

count() returns the number of values or records.

It has two variants:

count(*) returns the number of matching records.
count(expr) returns the number of non-null values returned by an expression.

Here are some considerations for using count():

count(*) includes records returning null.
count(expr) ignores null values.
count(null) returns 0.
count(*) can be used to return the number of nodes.
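
The null handling listed above can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Apache AGE, and the table t and column v are made up for illustration. SQLite's count follows the same three rules, but whether AGE matches it in every edge case is an assumption here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
# Three records, one of which holds a null value.
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (None,), (3,)])

star, non_null, of_null = conn.execute(
    "SELECT count(*), count(v), count(NULL) FROM t"
).fetchone()
conn.close()

print(star)      # 3: count(*) includes the record whose value is null
print(non_null)  # 2: count(expr) skips the null value
print(of_null)   # 0: count(null) never counts anything
```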

In your query, the count(*) results are added to the sum of the properties (x.a + x.b + x.c).

Reference: Here

score 0 · Answer 7 · answered Aug 21 '23 at 18:37

The count() function in your example counts rows. The query calculates the sum of the properties x.a, x.b, and x.c for each node x. The grouping key is x.a + x.b + x.c, and the data is divided into buckets based on this key. count(*) tallies the number of nodes in each bucket, and since it is used twice, that node count is added twice. The result is then added to the sum of the properties, which gives the output shown (6 + 3 + 3 = 12).

You can refer to this documentation for more info.
