I am trying to retrieve the cast list for movies from Wikidata. My SPARQL query for Dr. No is as follows:

SELECT ?actor ?actorLabel WHERE {
  ?movie wdt:P161 ?actor .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  FILTER(?movie = wd:Q102754)
}
LIMIT 1000

I can try it out at query.wikidata.org but the results are not in the order that I want. It gives 'Sean Connery', 'Zena Marshall', 'Ursula Andress'.

The database does have the data in the required order: https://www.wikidata.org/wiki/Q102754 lists the cast in order (Sean Connery, Ursula Andress, Joseph Wiseman). Generally the cast list is given in billing order, and that is the order I want to recover.

logi-kal
SQL Hacks

1 Answer

SPARQL can order results with ORDER BY.

The ordering in your example appears to be based on the number of references each statement has. Here is a non-optimized version that sorts the cast that way:

SELECT ?actor ?actorLabel WHERE {
  ?movie p:P161 ?statement .
  ?statement ps:P161 ?actor .
  OPTIONAL {?statement prov:wasDerivedFrom ?ref . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  FILTER(?movie = wd:Q102754)
}
GROUP BY ?movie ?actor ?actorLabel
ORDER BY DESC(COUNT(?ref)) ASC(?actorLabel)
LIMIT 1000
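
To make the sort key concrete, here is a minimal Python sketch of what ORDER BY DESC(COUNT(?ref)) ASC(?actorLabel) does to the grouped rows. The reference counts below are invented for illustration and are not taken from Wikidata.

```python
# Hypothetical (label, reference_count) rows; the counts are made up.
rows = [
    ("Ursula Andress", 2),
    ("Zena Marshall", 0),
    ("Sean Connery", 3),
    ("Joseph Wiseman", 2),
]

# Sort by reference count descending, then by label ascending,
# mirroring ORDER BY DESC(COUNT(?ref)) ASC(?actorLabel).
ordered = sorted(rows, key=lambda row: (-row[1], row[0]))
labels = [label for label, _count in ordered]
print(labels)
```

Statements with the same number of references fall back to alphabetical order, so this key only matches billing order where the reference counts happen to follow it.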
UninformedUser
  • Yes, that could get me the actors in alphabetic order - but I want them in order of importance - which is roughly the order they appear in the original data. – SQL Hacks Feb 22 '17 at 15:17
  • How do you know that the results are sorted by "importance"? What kind of measure is this? – UninformedUser Feb 22 '17 at 15:20
  • Admittedly "importance" is kind of vague, but in Wikipedia the actors are given (more or less) in the order that they appear in the movie credits. Generally the star of the movie comes first, the co-star comes second and so on down to "Man in a hurry at the station". My guess is that this is something that actors and agents care about a great deal. And it matters in applications. – SQL Hacks Feb 22 '17 at 17:07
  • Ordering by the reference count is nice, and a good proxy in lots of cases. But it is not what I am after. Thanks for the suggestion. – SQL Hacks Feb 22 '17 at 17:09
  • @SQLHacks It appears that on the page that you linked to in the question, they're ordered by reference. If that's not what you're looking for, then what *are* you looking for? Generally, RDF triples don't impose an ordering on things. Seeing three triples "x hasActor A . x hasActor B . x hasActor C" is the same as "x hasActor B . x hasActor C . x hasActor A". Unless wikidata has some extra information, you'd have to define your own ordering. – Joshua Taylor Feb 22 '17 at 21:57
  • I feared as much. Wikidata does have the sequence and I can get it from the JSON, but that is a lot of work. Thanks for confirming that there is no ordering on the triples. (The first few cast members are ordered by reference count, but further down the list that's not true.) – SQL Hacks Feb 22 '17 at 23:18
  • Then please ask on the Wikidata mailing list **how** those things are ordered. Without that, I can't give you the right SPARQL query. At least, there is no measure like PageRank in the Wikidata endpoint. – UninformedUser Feb 23 '17 at 07:37
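
The JSON route mentioned in the comments can be sketched in a few lines of Python. This is only a sketch: it assumes the entity JSON (for example from https://www.wikidata.org/wiki/Special:EntityData/Q102754.json) has already been downloaded and parsed, and that the order of the P161 statements in that JSON matches the order shown on the item page. The helper name `cast_in_statement_order` is hypothetical, and the sample document below is a trimmed stand-in with placeholder item IDs, not the real cast.

```python
import json


def cast_in_statement_order(entity_data, movie_id, prop="P161"):
    """Return cast member item IDs in the order the statements appear."""
    claims = entity_data["entities"][movie_id].get("claims", {})
    ids = []
    for statement in claims.get(prop, []):
        snak = statement.get("mainsnak", {})
        # "somevalue"/"novalue" snaks carry no datavalue, so skip them.
        if snak.get("snaktype") != "value":
            continue
        ids.append(snak["datavalue"]["value"]["id"])
    return ids


# Trimmed stand-in for the real response; the item IDs are placeholders.
sample = json.loads("""
{"entities": {"Q102754": {"claims": {"P161": [
  {"mainsnak": {"snaktype": "value",
                "datavalue": {"value": {"id": "Q900001"}}}},
  {"mainsnak": {"snaktype": "somevalue"}},
  {"mainsnak": {"snaktype": "value",
                "datavalue": {"value": {"id": "Q900002"}}}}
]}}}}
""")

cast_ids = cast_in_statement_order(sample, "Q102754")
print(cast_ids)
```

The returned IDs would still need their labels looked up, for instance with a second query, so this recovers the sequence but not the names.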