I have a tricky question about labels. Using the SERVICE keyword, as in `SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en". }`, lets the label fall back to the next language when the first preference has no match for the target entity. However, I want to drop entities that have no label in any of the listed languages, and the service instead returns such an entity with its Qxxxx ID as the label. How can I remove those entities from the result? I know I can filter them out by matching rdfs:label explicitly for every variable, but adding rdfs:label patterns for all the variables is another headache. Is there a way to do this while keeping SERVICE wikibase:label, or should I replace SERVICE with rdfs:label?

    SELECT DISTINCT ?vLabel
    WHERE {
      hint:Query hint:optimizer "None" .
      {
        SELECT DISTINCT ?i {
          ?i wdt:P31 wd:Q515 .
        } LIMIT 15
      }
      ?v wdt:P937 ?i .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en". }
    }
    LIMIT 3

Result:

    Q59780594 <- no lang label
    Q24642253 <- no lang label
Keanu Paik

1 Answer

The Wikidata label service doesn't provide a built-in way to skip resources that don't have a label.

The simplest option would be to wrap the query as a subquery into a new SELECT query, and use a filter to remove any Qxxxx labels. This uses the fact that only the real labels have a language tag:

    SELECT ?vLabel {
      {
        SELECT DISTINCT ?vLabel
        ...
      }
      FILTER lang(?vLabel)
    }
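
As an illustration, substituting the query from the question for the `...` placeholder might look like the sketch below. It is untested and assumes the Wikidata Query Service with its predefined prefixes. Moving `LIMIT 3` to the outer query is an assumption about the intent, so that the limit applies only after the unlabelled rows have been removed:

```sparql
SELECT ?vLabel {
  {
    # Inner query: the question's query, minus its LIMIT 3.
    SELECT DISTINCT ?vLabel
    WHERE {
      hint:Query hint:optimizer "None" .
      {
        SELECT DISTINCT ?i {
          ?i wdt:P31 wd:Q515 .
        } LIMIT 15
      }
      ?v wdt:P937 ?i .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "ko,en". }
    }
  }
  # Real labels carry a language tag; the Qxxxx fallback does not.
  FILTER lang(?vLabel)
}
LIMIT 3
```

With the limit inside the subquery instead, the filter could leave fewer than three rows, because it would run on an already truncated result.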

Edit: Below is my original (and inferior) answer, which used a regular expression on the label itself to remove the Qxxxx ones. It would also filter out any resources that actually have a label of the form Qxxxx, if such resources exist in Wikidata.

    SELECT ?vLabel {
      {
        SELECT DISTINCT ?vLabel
        ...
      }
      FILTER (!REGEX(?vLabel, "^Q[0-9]+$"))
    }

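
If both concerns matter, the two checks could be combined so that a row is dropped only when the value has no language tag and also looks like a bare Qxxxx ID. This is an untested sketch rather than part of the original answer; `STR()` is added on the assumption that `?vLabel` may hold non-string values such as URLs:

```sparql
SELECT ?vLabel {
  {
    SELECT DISTINCT ?vLabel
    ...
  }
  # Keep a row if it has a real (language-tagged) label,
  # or if its value does not look like a bare Q-ID.
  FILTER (lang(?vLabel) != "" || !REGEX(STR(?vLabel), "^Q[0-9]+$"))
}
```

Under this filter, a genuine label of the form Qxxxx is kept because it carries a language tag, and an untagged value that is not a Q-ID is kept because the regex does not match it.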
cygri

Perhaps `filter(lang(?vLabel)!='')` would be better. – Stanislav Kralin Apr 10 '19 at 08:48

@StanislavKralin Great suggestion! Or even just `FILTER lang(?vLabel)`, since empty strings are “falsy”. I've updated the answer. – cygri Apr 10 '19 at 10:56

Actually, I prefer the regex version, because I found that the variable `v` can be a URL or some other value that has no label but is still a valid answer. If I use `FILTER lang`, those values are filtered out as well. The regex above filters out only the `Qxxx` entries, which is what I originally intended. – Keanu Paik Apr 11 '19 at 13:24