I am trying to query the frequency of certain attributes in Wikidata, using SPARQL.
For example, to find the frequency of each gender value (P21) among humans (Q5), I use the following query:
SELECT ?rid (COUNT(?rid) AS ?count)
WHERE {
  ?qid wdt:P21 ?rid.
  BIND(wd:Q5 AS ?human)
  ?qid wdt:P31 ?human.
}
GROUP BY ?rid
I get the following result:
wd:Q6581097 2752163
wd:Q6581072 562339
wd:Q1052281 223
wd:Q1097630 68
wd:Q2449503 67
wd:Q48270 36
wd:Q44148 8
wd:Q43445 4
t152990852 1
t152990762 1
t152990752 1
t152990635 1
t152775383 1
t152775370 1
t152775368 1
...
I have the following questions regarding this:
- What do those t152... values refer to?
- How can I ignore the tuples containing t152...? I tried FILTER ( !strstarts(str(?rid), "wd:") ) but it timed out.
- How can I count the number of distinct answers? I tried SELECT (COUNT(DISTINCT ?rid) AS ?count) with the above query, but again it timed out.
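
Spelled out, the distinct-count attempt from the last question would presumably look like the sketch below. This is a reconstruction under assumptions, not the exact query that was run: it assumes the GROUP BY clause is dropped (otherwise each group would simply count 1), that wd:Q5 is written inline instead of being bound through BIND, and that the wd: and wdt: prefixes are the ones predefined by the Wikidata Query Service.

```sparql
# Sketch of the distinct-count variant (assumed form, reconstructed from the description).
# Counts how many different gender values occur among humans.
SELECT (COUNT(DISTINCT ?rid) AS ?count)
WHERE {
  ?qid wdt:P31 wd:Q5 .   # ?qid is an instance of human (Q5)
  ?qid wdt:P21 ?rid .    # ?rid is the value of sex or gender (P21)
}
```

Without GROUP BY the query returns a single row, so ?count is the number of distinct ?rid values rather than a per-value frequency.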