I'm going through the Living people category on Wikipedia and collecting page images. The problem is that some images are stored on Wikimedia Commons, whereas others are stored locally on the English Wikipedia (wikipedia:en). I want to know where each image is actually stored, including the case where it is stored somewhere other than wikipedia:en or Commons.
import pywikibot
enwiki = pywikibot.Site("en", "wikipedia")
commons = pywikibot.Site("commons", "commons")
page1 = pywikibot.Page(enwiki, "50 Cent")
page2 = pywikibot.Page(enwiki, "0010x0010")
pageimage1 = page1.page_image()
pageimage2 = page2.page_image()
pageimage1.exists()  # outputs False (the 50 Cent page image is stored on Commons)
pageimage2.exists()  # outputs True (the 0010x0010 page image is stored on wikipedia:en)
This is fine: I can check Commons if .exists() returns False on Wikipedia, but I'm worried about the case where the image is stored on a different site.
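A minimal sketch of that fallback check, for reference. The helper names `first_existing` and `locate_page_image` are hypothetical, not part of pywikibot, and the sketch assumes the file has the same title on every candidate site. It only covers sites that are listed explicitly, which is exactly the limitation described above.

```python
def first_existing(candidates):
    """Return the site of the first candidate page that exists, else None.

    `candidates` is any iterable of objects with an exists() method and a
    `site` attribute, such as pywikibot.FilePage instances.
    """
    for page in candidates:
        if page.exists():
            return page.site
    return None


def locate_page_image(page, fallback_sites):
    """Check the page's own site first, then each fallback site in order.

    Hypothetical helper: assumes the file title is identical on every site.
    """
    import pywikibot  # imported lazily so first_existing() works without it

    file_page = page.page_image()
    if file_page is None:  # defensive: the page may have no page image
        return None
    candidates = [file_page]
    for site in fallback_sites:
        # Build a file page with the same title on the fallback site.
        candidates.append(pywikibot.FilePage(site, file_page.title()))
    return first_existing(candidates)
```

With the objects from the snippet above, `locate_page_image(page1, [commons])` would check wikipedia:en first and then Commons, and return `None` if the file is on neither.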
I've tried the Page.image_repository attribute, but it returns Commons even when the page image does not exist there and is stored on wikipedia:en.
Is there a way to get the site where the image is actually stored from the Page object? The only way I know of is to download the HTML page and parse it, which is far too complicated.