
I would like to use the wikipedia API to return the extract from multiple wikipedia articles at once. I am trying, for example, the following request (I just chose the pageids randomly):

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1

But it only contains the extract for the first pageid, and not the second. Other properties seem not to have this limitation. For example

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=categories

will return the categories for both pageids. Is this a bug, or am I missing something?
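As a side note, a multi-value `pageids` parameter like the one above can be built programmatically. A minimal Python sketch (the helper name is mine, not part of the API; page IDs are joined with `|`, which the MediaWiki API expects for multi-value parameters):

```python
from urllib.parse import urlencode

def build_extracts_url(pageids, sentences=1):
    # Hypothetical helper: builds a query URL asking for extracts
    # of several pages at once. "|" is percent-encoded as %7C,
    # which the API accepts.
    params = {
        "format": "xml",
        "action": "query",
        "pageids": "|".join(str(p) for p in pageids),
        "prop": "extracts",
        "exsentences": sentences,
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

print(build_extracts_url([3258248, 11524059]))
```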

Ian Hincks

1 Answer


Notice the `<query-continue>` element. It tells you that to get more of the extracts, you need to specify `excontinue=1`:

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1&excontinue=1

You should be able to get both of them by specifying `exlimit=max`:

http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=3258248|11524059&prop=extracts&exsentences=1&exlimit=max

But that does not seem to work correctly; I'm not sure why.

BTW, categories have a similar limit, which is why your categories query also contains `<query-continue>` and why it doesn't list all categories of the articles.
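A loop that follows the continuation value might look like this in Python. This is only a sketch: it assumes the old-style `query-continue` response shape and uses `format=json` for easier parsing, and `fetch` is a stand-in for whatever HTTP call you use (it should take a parameter dict and return the parsed response):

```python
def collect_extracts(fetch, pageids, sentences=1):
    # Sketch: keep requesting until the API stops returning a
    # continuation value for the extracts property.
    base = {
        "format": "json",
        "action": "query",
        "pageids": "|".join(str(p) for p in pageids),
        "prop": "extracts",
        "exsentences": sentences,
    }
    extracts = {}
    params = dict(base)
    while True:
        resp = fetch(params)
        for pid, page in resp.get("query", {}).get("pages", {}).items():
            if "extract" in page:
                extracts[pid] = page["extract"]
        cont = resp.get("query-continue", {}).get("extracts")
        if not cont:
            break
        # Repeat the original parameters plus the continuation
        # values (e.g. excontinue) from the previous response.
        params = dict(base, **cont)
    return extracts
```

The key point is that each follow-up request repeats the original parameters and merges in the continuation values the previous response handed back.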

svick
  • Ah, thank you for pointing me towards `exlimit=max` -- it ended up being the key. However, for a reason I don't understand, I also had to set `exintro`; the combination of the two made it fetch all extracts. After looking around for a while, I don't think `excontinue` is a boolean -- it seems to be a more involved paging mechanism -- so I just left it out. So my whole query is: `http://en.wikipedia.org/w/api.php?format=xml&action=query&pageids=11524059|3258248&prop=extracts&exlimit=max&explaintext&exintro` – Ian Hincks Mar 24 '12 at 00:59
  • Yes, `excontinue` is just a value that you can use to get the next page; it has no other meaning. And now I understand the limit, [the documentation on mediawiki.org](https://www.mediawiki.org/wiki/Extension:MobileFrontend#prop.3Dextracts) says: “Because excerpts generation can be slow, the limit is capped at 20 for intro-only extracts and 1 for whole-page extracts.” – svick Mar 24 '12 at 03:32
  • This answer doesn't seem to work anymore. I still face the same problem, but the API doesn't recognize `exlimit=max`. – user666 Nov 02 '16 at 19:29
  • @user666 Yeah, looks like it: "exlimit was too large for a whole article extracts request, lowered to 1". It seems you will have to use continue instead. – svick Nov 02 '16 at 22:50
  • Is using `continue` preferred over making a new request for every single pageid? I could not find anything about this in the documentation. – user666 Nov 03 '16 at 19:59
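For reference, the working combination reported in the comments above (`exlimit=max` together with `exintro` and `explaintext`, which stays within the 20-page cap for intro-only extracts) can be sketched as a URL builder; the helper name is illustrative, not part of the API:

```python
from urllib.parse import urlencode

def build_intro_extracts_url(pageids):
    # Illustrative helper: flag parameters like explaintext and exintro
    # are present-but-valueless, so they are encoded with empty values.
    params = [
        ("format", "xml"),
        ("action", "query"),
        ("pageids", "|".join(str(p) for p in pageids)),
        ("prop", "extracts"),
        ("exlimit", "max"),
        ("explaintext", ""),  # plain text instead of limited HTML
        ("exintro", ""),      # intro-only: allows extracts for up to 20 pages
    ]
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

print(build_intro_extracts_url([11524059, 3258248]))
```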