
I am using Wikipedia-API 0.5.4, and I would like to retrieve the item ID for the item being discussed on a given page. Is it possible to do this using the data returned from a page query?

I am able to retrieve the pageid. However, pages in different languages about the same item do not share a pageid, even though they all refer to a single item with one unique item ID.

In the example below, the pageid for the English language page on the singer Cher is different from the pageid for the corresponding French language page, while the item ID for "Cher" should be the same in both cases.

Is the item ID not accessible from the page object?

import wikipediaapi as wp
wp_en = wp.Wikipedia('en')
cher_en = wp_en.page('Cher')

print(cher_en.pageid)
> 80696

print(cher_en.langlinks['fr'].pageid)
> 339022
Tashus
  • @JeanHominal It is my understanding that the item ID is a unique reference to a real world person, place, thing, concept, etc. Cher is Cher, regardless of whether there is an article written about her in English or in French. – Tashus Apr 09 '21 at 17:13

1 Answer


I ended up using the requests library to call the MediaWiki Action API (w/api.php) directly. Including prop=pageprops in the query returns the page properties, among them wikibase_item, which is the Wikidata item ID and is shared across language editions.

import requests as rq

request_str = 'https://en.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Cher&format=json'
resp = rq.get(request_str)
resp.text.split('wikibase_item":"')[1].split('"')[0]
> 'Q12003'

fr_str = 'https://fr.wikipedia.org/w/api.php?action=query&prop=pageprops&titles=Cher_(artiste)&format=json'
fr_resp = rq.get(fr_str)  # query the French URL, not the English request_str
fr_resp.text.split('wikibase_item":"')[1].split('"')[0]
> 'Q12003'
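
Splitting the raw response text works, but it breaks if the JSON layout shifts or the page has no item. As a rough sketch of a sturdier variant, the response can be parsed as JSON and the wikibase_item read from the pageprops of each returned page. The helper names extract_wikibase_item and fetch_item_id, the ppprop filter, and the User-Agent string are my own assumptions for illustration, not part of Wikipedia-API.

```python
from typing import Optional


def extract_wikibase_item(payload: dict) -> Optional[str]:
    """Return the first wikibase_item in an Action API query response, or None."""
    # Pages are keyed by pageid under query -> pages; a missing page has no pageprops.
    pages = payload.get("query", {}).get("pages", {})
    for page in pages.values():
        item = page.get("pageprops", {}).get("wikibase_item")
        if item:
            return item
    return None


def fetch_item_id(title: str, lang: str = "en") -> Optional[str]:
    """Hypothetical helper: look up the item ID for a title on one language edition."""
    import requests  # imported here so the parsing helper has no third-party dependency

    resp = requests.get(
        f"https://{lang}.wikipedia.org/w/api.php",
        params={
            "action": "query",
            "prop": "pageprops",
            "ppprop": "wikibase_item",  # assumption: only this property is needed
            "titles": title,
            "format": "json",
        },
        # Placeholder User-Agent; replace with something identifying your script.
        headers={"User-Agent": "item-id-example/0.1"},
        timeout=10,
    )
    resp.raise_for_status()
    return extract_wikibase_item(resp.json())
```

Calling fetch_item_id('Cher') and fetch_item_id('Cher_(artiste)', lang='fr') should then give the same ID. A None result means the title was not found or the page has no linked item.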
Tashus