I'm using this approach to retrieve the Wikipedia URL of a Wikidata item in multiple languages, using SPARQL:

SELECT ?item ?en ?url_en ?es ?url_es WHERE {
  { ?item wdt:P31 wd:Q6256. }
  UNION
  { ?item wdt:P31 wd:Q1250464. }
  UNION
  { ?item wdt:P31 wd:Q3624078. }
  UNION
  { ?item wdt:P31 wd:Q619610. }
  UNION
  { ?item wdt:P31 wd:Q179164. }
  UNION
  { ?item wdt:P31 wd:Q7270. }
  ?item rdfs:label ?en filter (lang(?en) = "en").
  ?item rdfs:label ?es filter (lang(?es) = "es").
  OPTIONAL {
    ?url_en schema:about ?item .
    ?url_en schema:inLanguage "en" .
    FILTER (SUBSTR(str(?url_en), 1, 25) = "https://en.wikipedia.org/")
  }
  OPTIONAL {
    ?url_es schema:about ?item .
    ?url_es schema:inLanguage "es" .
    FILTER (SUBSTR(str(?url_es), 1, 25) = "https://es.wikipedia.org/")
  }
} LIMIT 1000

I get only a limited number of results despite the LIMIT value, whereas when I retrieve labels only:

SELECT ?item ?en ?es ?it WHERE {
  { ?item wdt:P31 wd:Q6256. }
  UNION
  { ?item wdt:P31 wd:Q1250464. }
  UNION
  { ?item wdt:P31 wd:Q3624078. }
  UNION
  { ?item wdt:P31 wd:Q619610. }
  UNION
  { ?item wdt:P31 wd:Q179164. }
  UNION
  { ?item wdt:P31 wd:Q7270. }
  ?item rdfs:label ?en filter (lang(?en) = "en").
  ?item rdfs:label ?es filter (lang(?es) = "es").
  ?item rdfs:label ?it filter (lang(?it) = "it").

} LIMIT 1000

I get more results, as if the

  OPTIONAL {
    ?url_en schema:about ?item .
    ?url_en schema:inLanguage "en" .
    FILTER (SUBSTR(str(?url_en), 1, 25) = "https://en.wikipedia.org/")
  }

is limiting the results found in some way.

loretoparisi
  • You should use the `schema:isPartOf` relation to filter Wikipedia chapters: `SELECT ?item ?en ?url_en ?es WHERE { VALUES ?type {wd:Q6256 wd:Q1250464 wd:Q3624078 wd:Q619610 wd:Q179164 wd:Q7270} ?item wdt:P31 ?type . ?item rdfs:label ?en filter (lang(?en) = "en"). ?item rdfs:label ?es filter (lang(?es) = "es"). OPTIONAL { ?url_en schema:about ?item . ?url_en schema:isPartOf <https://en.wikipedia.org/> } OPTIONAL { ?url_es schema:about ?item . ?url_es schema:isPartOf <https://es.wikipedia.org/> } } LIMIT 1000` – UninformedUser Mar 30 '22 at 11:50
  • @UninformedUser Thank you. I see you are using `VALUES` instead of `UNION`. Is it a better solution? Also, your solution is more elegant, but it seems there are still issues, as in this case: https://gist.github.com/loretoparisi/478dbaa1b89e53bd0c01472011e2c81f – loretoparisi Mar 30 '22 at 13:44
  • Which issue do you mean in the Gist? I mean, technically it's just more languages? Are you still missing data? – UninformedUser Mar 30 '22 at 14:30
  • @UninformedUser Yep, the gist has the link to the Wikidata Query Service. I think I have found the issue: I wrapped each label in `OPTIONAL`, like `OPTIONAL { ?item rdfs:label ?en filter (lang(?en) = "en"). }` – loretoparisi Mar 30 '22 at 15:38
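
Putting the comments together, the query might look like the sketch below. It is not a confirmed answer, only a combination of the two suggestions in this thread: `VALUES` in place of the `UNION` chain (the two forms match the same items here, `VALUES` is just more compact), `schema:isPartOf` to pick the Wikipedia edition, and each label wrapped in `OPTIONAL` so that an item with no Spanish label is not dropped. It assumes the query is run on the Wikidata Query Service, where the `wd:`, `wdt:`, `rdfs:` and `schema:` prefixes are predefined, and it assumes the two edition IRIs are `<https://en.wikipedia.org/>` and `<https://es.wikipedia.org/>`.

```sparql
SELECT ?item ?en ?es ?url_en ?url_es WHERE {
  # One row per (item, type) pair; same item set as the UNION chain in the question
  VALUES ?type { wd:Q6256 wd:Q1250464 wd:Q3624078 wd:Q619610 wd:Q179164 wd:Q7270 }
  ?item wdt:P31 ?type .

  # Labels are optional, so an item without a label in one language is still returned
  OPTIONAL { ?item rdfs:label ?en . FILTER (lang(?en) = "en") }
  OPTIONAL { ?item rdfs:label ?es . FILTER (lang(?es) = "es") }

  # Sitelinks: the article is about the item and belongs to a given Wikipedia edition
  OPTIONAL {
    ?url_en schema:about ?item ;
            schema:isPartOf <https://en.wikipedia.org/> .
  }
  OPTIONAL {
    ?url_es schema:about ?item ;
            schema:isPartOf <https://es.wikipedia.org/> .
  }
} LIMIT 1000
```

An item that is an instance of more than one of the listed types will appear once per type; `SELECT DISTINCT` with `?type` left out of the projection would collapse those rows.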

0 Answers