
I have a WikiData id corresponding to a Wikipedia disambiguation page, for instance Q1811449. I want to get the ids of the entities listed on this page.

Is it possible to do so using the WikiData API? I could not find any property in Q1811449 that I could use for this purpose.

If it is not possible, does anyone know another way? I was thinking of retrieving the corresponding Wikipedia disambiguation page, extracting the internal links it contains, and looking them up on Wikidata. But maybe there is a simpler way?

Vincent Labatut

1 Answer


Based on this SO answer, this is the solution I have so far.

I query the MediaWiki API of Wikipedia (not the Wikidata API), using the label stored on Wikidata for the disambiguation page under consideration ("Lecointe" in the example from the question). With the appropriate parameters, the response contains the Wikidata ids of the pages linked from the disambiguation page: https://fr.wikipedia.org/w/api.php?action=query&generator=links&format=xml&redirects=1&titles=Lecointe&prop=pageprops&gpllimit=50&ppprop=wikibase_item

where:

  • titles=Lecointe is the title of the disambiguation page, taken here from its Wikidata label;
  • format=xml specifies the output format;
  • redirects=1 automatically resolves redirects;
  • generator=links enumerates the pages linked from that page (at most gpllimit=50 per request), while prop=pageprops and ppprop=wikibase_item return the Wikidata id of each of them.
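For illustration, here is a minimal Python sketch of the same request, using only the standard library. It assumes `format=json` instead of `format=xml` (only because JSON is easier to parse here), and the function names and the `User-Agent` string are placeholders of mine, not part of any API. It reads a single batch of results and does not handle continuation, so a page with more than `gpllimit` links would need extra requests.

```python
import json
import urllib.parse
import urllib.request

# French Wikipedia endpoint, as in the example URL above.
API_URL = "https://fr.wikipedia.org/w/api.php"


def build_query_url(title, api_url=API_URL, limit=50):
    """Build the request URL listing the links of `title` with their Wikidata ids."""
    params = {
        "action": "query",
        "generator": "links",
        "titles": title,
        "redirects": 1,
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "gpllimit": limit,
        "format": "json",  # assumption: JSON rather than XML, for easier parsing
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def extract_items(response):
    """Map each linked page title to its Wikidata id.

    Pages without a `wikibase_item` property (e.g. links to pages
    that do not exist) are skipped.
    """
    pages = response.get("query", {}).get("pages", {})
    items = {}
    for page in pages.values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid is not None:
            items[page["title"]] = qid
    return items


def linked_items(title):
    """Send the request and return one batch of results (no continuation)."""
    request = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "disambiguation-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as reply:
        return extract_items(json.load(reply))
```

Calling `linked_items("Lecointe")` performs the request and returns a dictionary from page titles to Wikidata ids.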

Still, I'd be glad if someone knows a solution using only Wikidata.
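As a side note, the page title passed as `titles=` can also be read from the Wikidata item's sitelinks instead of its label, which should avoid problems when the two differ. The sketch below assumes the `wbgetentities` action of the Wikidata API with `props=sitelinks`; the function names and the `User-Agent` string are hypothetical, and `frwiki` is simply the site used in the example above.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_sitelink_url(qid, site="frwiki"):
    """Build the wbgetentities URL returning the sitelink of `qid` on `site`."""
    params = {
        "action": "wbgetentities",
        "ids": qid,
        "props": "sitelinks",
        "sitefilter": site,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def extract_title(response, qid, site="frwiki"):
    """Return the page title of `qid` on `site`, or None if there is no sitelink."""
    entity = response.get("entities", {}).get(qid, {})
    sitelink = entity.get("sitelinks", {}).get(site)
    return sitelink["title"] if sitelink else None


def page_title(qid, site="frwiki"):
    """Send the request and return the page title (or None)."""
    request = urllib.request.Request(
        build_sitelink_url(qid, site),
        headers={"User-Agent": "disambiguation-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as reply:
        return extract_title(json.load(reply), qid, site)
```

For the item in the question, `page_title("Q1811449")` would return the title of the French disambiguation page, if the item has a `frwiki` sitelink.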

Vincent Labatut
  • There isn't one; items describe entities, disambiguation pages describe words. That disambiguation pages have Wikidata items at all is an internal quirk of the system (since Wikidata is also used to create navigation links between Wikipedias of different languages, and being able to navigate between disambiguation pages for the same word in multiple languages can be helpful); they don't really fit in meaningfully. – Tgr Mar 22 '17 at 06:09
  • How to fetch the list of disambiguated pages from the disambiguation page is something that changes from wiki to wiki. Some wikis take care to only link to the disambiguated pages from the disambiguation page (and not miscellaneous words), some put such links in bold, some don't even do that. – Tgr Mar 22 '17 at 06:11
  • Thanks, that's good to know. I've tested my method on a few pages of the English and French versions of Wikipedia, and it seems to work. But it might not work with other languages, or even with other FR/EN pages. Is that what you mean? – Vincent Labatut Mar 22 '17 at 08:02
  • Yeah. See e.g. https://da.wikipedia.org/wiki/Skygge_(flertydig) or https://nl.wikipedia.org/wiki/Arm which have "secondary" links. You can look at the style guide of the given wiki, e.g. https://en.wikipedia.org/wiki/Wikipedia:Disambiguation#Page_style but that's a big effort if you plan to process data from many wikis. – Tgr Mar 22 '17 at 17:39
  • The other thing to be aware of is that often the article for a meaning does not exist, and the disambiguation page links to an article with a much wider subject. So for example [DDB](https://en.wikipedia.org/wiki/DDB) links to [List of filename extensions (A–E)](https://en.wikipedia.org/wiki/List_of_filename_extensions_(A%E2%80%93E)) because the file format DDB does not have its own article. – Tgr Mar 22 '17 at 17:45
  • I hadn't thought about that, thanks. I'll run some other tests to see how this affects the kinds of entities I'm dealing with. – Vincent Labatut Mar 22 '17 at 18:41