I'm experimenting with the Python module wikipedia, which is a wrapper for the Wikipedia API. In particular, I'm looking at the links API, which as I understand it should return a 'List of titles of Wikipedia page links on a page', i.e. all the references to other Wikipedia pages within the text of the page I'm querying. When I look at the result for the article on Google, I get a list of links as expected (Wikipedia titles in JSON format). The problem is that some of the links listed there do not seem to appear on the Google page. I thought it might be including pages that link to Google, but that doesn't hold up either. In particular, the third link returned in the JSON structure is to ADATA. I don't see a link to ADATA anywhere on the Google page, nor a link to Google anywhere on the ADATA page. Is this a bug, or am I missing something obvious?
I believe this link is enough to reproduce the issue:
https://en.wikipedia.org/w/api.php?action=query&titles=Google&prop=links
The result I see looks like this:
{
    "continue": {
        "plcontinue": "1092923|0|Aardvark_(search_engine)",
        "continue": "||"
    },
    "query": {
        "pages": {
            "1092923": {
                "pageid": 1092923,
                "ns": 0,
                "title": "Google",
                "links": [
                    {
                        "ns": 0,
                        "title": "111 Eighth Avenue"
                    },
                    {
                        "ns": 0,
                        "title": "2600: The Hacker Quarterly"
                    },
                    {
                        "ns": 0,
                        "title": "ADATA"
                    },
                    . . .
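For anyone who wants to inspect the raw response without the wrapper, here is a minimal sketch of how the query URL could be built and how the titles could be pulled out of the structure above. The helper names `build_links_url` and `extract_link_titles` are my own, the `format=json` parameter is added on the assumption that you want JSON back, and `sample` is just a trimmed copy of the response shown above, not a fresh API call.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_links_url(title, cont=None):
    """Build a prop=links query URL (hypothetical helper, not part of the wrapper)."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "links",
        "format": "json",  # assumption: ask for JSON explicitly
    }
    if cont:
        # To fetch the next batch, pass back the values from the
        # "continue" object of the previous response.
        params.update(cont)
    return API_URL + "?" + urlencode(params)


def extract_link_titles(response):
    """Collect the link titles from a parsed prop=links response."""
    titles = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for link in page.get("links", []):
            titles.append(link["title"])
    return titles


# Trimmed copy of the response shown above (only the first three links).
sample = {
    "continue": {
        "plcontinue": "1092923|0|Aardvark_(search_engine)",
        "continue": "||",
    },
    "query": {
        "pages": {
            "1092923": {
                "pageid": 1092923,
                "ns": 0,
                "title": "Google",
                "links": [
                    {"ns": 0, "title": "111 Eighth Avenue"},
                    {"ns": 0, "title": "2600: The Hacker Quarterly"},
                    {"ns": 0, "title": "ADATA"},
                ],
            }
        }
    },
}

first_url = build_links_url("Google")
next_url = build_links_url("Google", sample["continue"])
titles = extract_link_titles(sample)
```

Here `titles` holds the three titles from the trimmed sample, and `next_url` is the request that should return the batch following the `plcontinue` marker, assuming the continuation values are passed back unchanged.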
In Python you can reproduce it like this:
import wikipedia
wikipedia.page('Google').links
which produces output like this:
['111 Eighth Avenue',
'2600: The Hacker Quarterly',
'ADATA',
'AI Challenge',
'AKM Semiconductor, Inc.',
'AOL',
'API.AI',