
I would like to obtain the template data of Wikipedia pages. I have tried several API actions such as parse, query, and expandtemplates, but have not been able to obtain all the information that I was looking for.

For example, the page about Abraham Lincoln: http://en.wikipedia.org/wiki/Abraham_Lincoln.

I query which templates are used on this page like so: http://en.wikipedia.org/w/api.php?action=query&prop=templates&format=jsonfm&tllimit=500&titles=Abraham_Lincoln

There are many templates. In particular I am interested in the "infobox" templates. If I understand the results correctly, there are 6 infobox templates:

  • "Template:Infobox U.S. Cabinet"
  • "Template:Infobox cabinet members"
  • "Template:Infobox cabinet members/row"
  • "Template:Infobox officeholder"
  • "Template:Infobox officeholder/Office"
  • "Template:Infobox officeholder/Personal data"
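For reference, the listing step above can be sketched in Python roughly as follows. This is only a minimal sketch: it assumes the default (legacy) JSON layout of `action=query`, where pages are keyed by page id, and it uses `format=json` instead of `jsonfm`. The helper names `templates_url` and `infobox_templates` are made up for illustration, and filtering on the "Template:Infobox" prefix is simply the rule I applied by eye.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def templates_url(title, limit=500):
    """Build the action=query&prop=templates URL for one page title."""
    params = {
        "action": "query",
        "prop": "templates",
        "format": "json",  # assumption: json rather than the jsonfm debug format
        "tllimit": limit,
        "titles": title,
    }
    return API + "?" + urlencode(params)


def infobox_templates(response):
    """Return template titles starting with 'Template:Infobox'.

    Assumes the legacy JSON layout: response["query"]["pages"] is a dict
    keyed by page id, each page carrying a "templates" list.
    """
    names = []
    for page in response.get("query", {}).get("pages", {}).values():
        for tpl in page.get("templates", []):
            if tpl["title"].startswith("Template:Infobox"):
                names.append(tpl["title"])
    return names
```

Feeding the decoded JSON of the URL returned by `templates_url("Abraham Lincoln")` into `infobox_templates` should give a list like the one above.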

Now comes the hard part. If I use the 'query' API like so: http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&titles=Abraham Lincoln&continue=&rvgeneratexml=

I get many templates but only 2 Infobox templates of the 6 above.

I also tried 'expandtemplates' and 'rvexpandtemplates', as recommended here: How to get wiki template's content?.

I also tried 'parse', as described here: How to get the result of a complex Wikipedia template?.

So my question is: how do I invoke the wiki API to retrieve the body of a particular template that I know exists on a particular page? (e.g. how to get the "Template:Infobox cabinet members" of Abraham Lincoln).

If that is not possible, then how do I get all 6 infobox templates for that particular page?

Thanks.

ewolfman
  • *If I understand the results correctly, there are 6 infobox templates...* - not really. There are two infobox templates (U.S. Cabinet and officeholder); the rest are various "subroutines" that implement parts of those templates. In general, it's easier to give helpful answers if you explain what your goal is instead of what you currently consider the best way of achieving it. – Tgr Apr 27 '15 at 08:10
  • Have you tried using dbpedia.org? It contains all structured content on wikis – John Strood Jul 26 '18 at 10:52

2 Answers

2

a] First get the template name in double curly brackets:

{{Template: Name}}

b] Use 'expandtemplates' API call with all parameters:

https://en.wikipedia.org/w/api.php?action=expandtemplates&text={{Template: Name}}&prop=wikitext&title=Page Title
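The URL above contains braces and spaces, so it probably needs to be percent-encoded before it is sent. A minimal sketch in Python, assuming `format=json` is added (it is not in the URL above) and that the response for `prop=wikitext` has the shape `{"expandtemplates": {"wikitext": ...}}`; the function names are hypothetical:

```python
from urllib.parse import urlencode, quote

API = "https://en.wikipedia.org/w/api.php"


def expandtemplates_url(template_call, page_title):
    """Build an action=expandtemplates URL with the text safely encoded."""
    params = {
        "action": "expandtemplates",
        "text": template_call,   # e.g. "{{Template: Name}}"
        "prop": "wikitext",
        "title": page_title,     # page to act as context for the expansion
        "format": "json",        # assumption: added so the reply is machine readable
    }
    # quote_via=quote encodes spaces as %20 and braces as %7B / %7D
    return API + "?" + urlencode(params, quote_via=quote)


def expanded_wikitext(response):
    """Pull the expanded wikitext out of a decoded JSON response, if present."""
    return response.get("expandtemplates", {}).get("wikitext")
```

For example, `expandtemplates_url("{{Infobox officeholder}}", "Abraham Lincoln")` gives a URL that can be fetched with any HTTP client.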
0

You can parse the content of a template as if it were included on a page, using action=parse. Just provide the title of the page you want to act as (in your case Abraham Lincoln), and use contentmodel=wikitext to pass the wikitext of the template, like this (actual wikitext omitted, for obvious reasons):

https://en.wikipedia.org/w/api.php?action=parse&contentmodel=wikitext&title=Abraham Lincoln&prop=text&text=<table class="infobox ... snip
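Since real template wikitext can be long, it may be easier to send it as a POST body than in the URL. A rough sketch with the Python standard library, which only builds the request without sending it. The `format=json` parameter, the User-Agent string and the name `parse_request` are my own assumptions, not part of the call above:

```python
from urllib.parse import urlencode
from urllib.request import Request

API = "https://en.wikipedia.org/w/api.php"


def parse_request(wikitext, page_title):
    """Build a POST request for action=parse with the wikitext in the body."""
    params = {
        "action": "parse",
        "contentmodel": "wikitext",
        "title": page_title,  # page to act as while parsing
        "prop": "text",
        "text": wikitext,
        "format": "json",     # assumption: added for a machine-readable reply
    }
    body = urlencode(params).encode("utf-8")
    # Supplying data makes urllib issue a POST; the User-Agent is a placeholder.
    return Request(API, data=body, headers={"User-Agent": "template-demo/0.1 (example)"})
```

Passing the returned object to `urllib.request.urlopen` would perform the call.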
leo
  • sorry, but this doesn't really help. the whole point is that i don't have the infobox text of that template. – ewolfman Apr 26 '15 at 20:52
  • For that you will have to parse the wikitext of the content article (Abraham Lincoln) yourself. Templates in MediaWiki are just wikitext, so there is no way for the API to know what is inside and outside the template. – leo Apr 27 '15 at 07:46
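To illustrate what "parse the wikitext yourself" could look like, here is a naive sketch in Python that pulls template calls out of raw wikitext by counting `{{` and `}}` pairs. It is an assumption-laden toy: it ignores `<nowiki>`, HTML comments and other edge cases, and the function name `extract_templates` is hypothetical. A dedicated wikitext parser such as mwparserfromhell is likely a more robust choice.

```python
import re


def extract_templates(wikitext, name_prefix):
    """Return every top-level template call whose name starts with name_prefix.

    Naive brace counting: nested {{...}} calls are kept inside the result.
    """
    pattern = re.compile(r"\{\{\s*" + re.escape(name_prefix), re.IGNORECASE)
    results = []
    pos = 0
    while True:
        match = pattern.search(wikitext, pos)
        if match is None:
            break
        start = match.start()
        depth = 0
        j = start
        end = None
        while j < len(wikitext) - 1:
            pair = wikitext[j:j + 2]
            if pair == "{{":
                depth += 1
                j += 2
            elif pair == "}}":
                depth -= 1
                j += 2
                if depth == 0:
                    end = j
                    break
            else:
                j += 1
        if end is None:
            break  # unbalanced braces: stop rather than guess
        results.append(wikitext[start:end])
        pos = end
    return results
```

Calling `extract_templates(page_wikitext, "Infobox")` on the revision content of a page would return only the infobox calls written directly in the article, not the sub-templates they use internally.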