
I have a collection of RDF triples like the following.

id#7289587  ex:getInfectedBy    id#7748320
id#7694711  ex:getInfectedBy    id#7748320
id#7748322  ex:getInfectedBy    id#7748320
id#7748887  ex:getInfectedBy    id#7748320

id#7746679  ex:getInfectedBy    id#7748510
id#6434108  ex:getInfectedBy    id#7748510
id#7458397  ex:getInfectedBy    id#7748510

My goal is to count star subgraph patterns of various sizes (4, 5, 6, ..., 20 nodes). For example, I have written the following query to find star subgraph patterns with 4 nodes (?s1 ?s2 ?s3 ?o).

SELECT ?o count(distinct ?o)
WHERE
{
  ?s1 ?p ?o.
  ?s2 ?p ?o.
  ?s3 ?p ?o.
  FILTER((?s1!=?s2) && (?s1!=?s3) && (?s2!=?s3))
} GROUP BY ?o

The query above reports a 4-node star pattern for both nodes id#7748320 and id#7748510. However, it is supposed to return only node id#7748510. If I extend the query to a 5-node star pattern, node id#7748320 shows up there as well. Could you please help me fix it?

Is it possible to count star subgraph patterns of various sizes (4, 5, 6, ..., 20 nodes) with one query? Please let me know. I appreciate your help.

Beautiful Mind
  • Simply use `SELECT * WHERE ` with your query to see why this is correct in SPARQL. It's obvious that the data of node `id#7748320` also matches the pattern of the SPARQL query, it's just that you're asking for something that satisfies "at least" that requirement in the query. – UninformedUser May 04 '17 at 06:36

1 Answer

In addition to my comment, I would simply use a different and more efficient query that counts the distinct incoming nodes per node and then filters the groups with HAVING:

SELECT  ?o (COUNT(DISTINCT ?s) AS ?cnt)
WHERE
  { ?s  ?p  ?o }
GROUP BY ?o
HAVING ( ?cnt = 3 ) # three incoming nodes
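
To check the grouping idea against the sample triples from the question without an endpoint, the same logic can be sketched in plain Python. This is only an illustration, not part of the SPARQL solution: the function names (`incoming_counts`, `stars_with_exactly`, `stars_with_at_least`, `star_size_histogram`) are made up here, and the predicate is ignored on the assumption that only `ex:getInfectedBy` edges are present. It also shows why the three-pattern query in the question matches both centre nodes (it asks for "at least" three subjects) and how all star sizes can be tallied in one pass.

```python
from collections import Counter, defaultdict

# Sample triples copied from the question: (subject, predicate, object).
TRIPLES = [
    ("id#7289587", "ex:getInfectedBy", "id#7748320"),
    ("id#7694711", "ex:getInfectedBy", "id#7748320"),
    ("id#7748322", "ex:getInfectedBy", "id#7748320"),
    ("id#7748887", "ex:getInfectedBy", "id#7748320"),
    ("id#7746679", "ex:getInfectedBy", "id#7748510"),
    ("id#6434108", "ex:getInfectedBy", "id#7748510"),
    ("id#7458397", "ex:getInfectedBy", "id#7748510"),
]


def incoming_counts(triples):
    """Map each object node to its number of distinct subjects
    (the Python counterpart of COUNT(DISTINCT ?s) ... GROUP BY ?o)."""
    subjects = defaultdict(set)
    for s, _p, o in triples:  # predicate ignored (assumption, see lead-in)
        subjects[o].add(s)
    return {o: len(ss) for o, ss in subjects.items()}


def stars_with_exactly(triples, n_leaves):
    """Centre nodes with exactly n_leaves incoming nodes (the HAVING filter)."""
    return sorted(o for o, c in incoming_counts(triples).items() if c == n_leaves)


def stars_with_at_least(triples, n_leaves):
    """Centre nodes with n_leaves or more incoming nodes, which is what
    the three-pattern query in the question effectively asks for."""
    return sorted(o for o, c in incoming_counts(triples).items() if c >= n_leaves)


def star_size_histogram(triples, min_nodes=4, max_nodes=20):
    """Number of stars per total node count (leaves plus one centre)."""
    sizes = Counter(c + 1 for c in incoming_counts(triples).values())
    return {n: sizes[n] for n in range(min_nodes, max_nodes + 1) if sizes[n]}


if __name__ == "__main__":
    print(stars_with_exactly(TRIPLES, 3))
    print(stars_with_at_least(TRIPLES, 3))
    print(star_size_histogram(TRIPLES))
```

In SPARQL, the histogram step would presumably correspond to wrapping the per-node count in a subquery and grouping the outer query by the count.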
UninformedUser
  • Many thanks for your reply. I think you mean `?node`, not `?o`. I ran the query you provided on a Virtuoso SPARQL endpoint, and it gives me the following error: `Virtuoso 37000 Error SP031: SPARQL compiler: Variable ?cnt is used in the result set outside aggregate and not mentioned in GROUP BY clause`. Could you please take a look? – Beautiful Mind May 04 '17 at 15:21
  • It works if I modify the `HAVING` clause as follows: `HAVING (COUNT(DISTINCT ?s) = 3)`. Thank you very much for your help. – Beautiful Mind May 04 '17 at 15:40
  • Right, I forgot that there is an issue with Virtuoso for some cases where the result of an aggregate is still in the scope but the error is reported anyways. – UninformedUser May 04 '17 at 19:06