5

Is there a way to extract all links from only the "See also" section of a Wikipedia article through the Wikipedia API?

I've been looking for a method but haven't been able to find one.

Termininja
Luca

2 Answers

5

Yes, you can do this with the Wikipedia API using action=parse. Two properties are needed: sections and links. For example, for the Wikipedia article Chicago, the following query returns the index of the section named "See also":

https://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=Chicago

From the response we see that the index is 43. We then use that index to get only the links in this section:

https://en.wikipedia.org/w/api.php?action=parse&prop=links&page=Chicago&section=43

Note: The last response can also include links that come from templates, in our case Portal:Chicago and Portal:Illinois. If you want, you can filter them out by adding the namespace parameter &ns=0 to your request.
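The two requests above can be chained in a short script. This is only a minimal sketch: the function names are made up for illustration, and it assumes the JSON response (format=json) has the shape parse.sections[].line / .index for the first call and parse.links[].ns / ["*"] for the second. The namespace filter is applied on the client side here, using the ns field of each returned link, instead of relying on a request parameter.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def call_parse(**params):
    """Call action=parse and return the decoded JSON (needs network access)."""
    query = urllib.parse.urlencode({"action": "parse", "format": "json", **params})
    request = urllib.request.Request(
        f"{API_URL}?{query}",
        # Hypothetical User-Agent string; replace it with your own.
        headers={"User-Agent": "see-also-example/0.1 (illustrative script)"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_section_index(sections_response, heading="See also"):
    """Return the index of the section whose heading matches, or None."""
    # Assumed shape: {"parse": {"sections": [{"line": ..., "index": ...}, ...]}}
    for section in sections_response["parse"]["sections"]:
        if section["line"].strip().lower() == heading.lower():
            return section["index"]
    return None


def article_links(links_response):
    """Keep only links in the main (article) namespace, i.e. ns == 0."""
    # Assumed shape: {"parse": {"links": [{"ns": ..., "*": title}, ...]}}
    return [link["*"] for link in links_response["parse"]["links"] if link["ns"] == 0]


def see_also_links(page):
    """Look up the "See also" section of a page and return its article links."""
    index = find_section_index(call_parse(page=page, prop="sections"))
    if index is None:
        return []
    return article_links(call_parse(page=page, prop="links", section=index))
```

Calling see_also_links("Chicago") would perform both requests. Looking the index up by heading each time avoids hard-coding 43, which can change whenever the article's sections are edited.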

Termininja
  • Is there a way to get all the sections and links from a wiki API endpoint and figure out which link belongs to which section in our application logic? – Lazhar Feb 28 '18 at 18:00
0

Not directly through the API. MediaWiki tracks links on a per-page basis; it doesn't store information about which section a link comes from.

I think your best option is to get the HTML of the section, parse it, and collect all <a> elements that have an href attribute.
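As a rough sketch of that approach, the snippet below collects href values from a chunk of section HTML using only Python's standard library. The names HrefCollector and extract_hrefs are hypothetical, and it assumes you have already obtained the section's HTML as a string (for example from action=parse with prop=text and a section index).

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hrefs(section_html):
    """Return all href values found in <a> tags, in document order."""
    collector = HrefCollector()
    collector.feed(section_html)
    collector.close()
    return collector.hrefs
```

The returned values are whatever appears in the markup, so relative paths such as /wiki/... would still need to be resolved or filtered by the caller.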

svick