Given any page on Wikipedia, such as the one for Coffee, I'm trying to figure out how to extract a list of all references (including any metadata) on the page. At first glance this seems easy, since most pages list them all under a section called References. However, when you examine the wikitext of those pages, you find that the References section just contains the {{reflist}} template, which I believe generates the list dynamically from all of the entries throughout the text of the page.
When I examine the wikitext of the passages that cite each reference, I find that the citations are enclosed in <ref></ref> tags. The content between these tags depends on the citation type.
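To make the parsing idea concrete, here is a minimal sketch of what I have in mind, assuming a regular expression is good enough for a first pass. The names REF_RE and extract_refs are my own, not part of any library, and the pattern deliberately ignores nested tags and citation templates that do not use <ref> at all.

```python
import re

# Hypothetical pattern: matches either a self-closing <ref ... /> tag or a
# <ref ...>body</ref> pair. The leading \s in "attrs" keeps it from matching
# <references />.
REF_RE = re.compile(
    r"<ref(?P<attrs>\s[^>]*?)?(?:/>|>(?P<body>.*?)</ref\s*>)",
    re.IGNORECASE | re.DOTALL,
)


def extract_refs(wikitext):
    """Return one dict per <ref> tag; body is None for self-closing tags."""
    refs = []
    for m in REF_RE.finditer(wikitext):
        body = m.group("body")
        refs.append({
            "attrs": (m.group("attrs") or "").strip(),
            "body": body.strip() if body is not None else None,
        })
    return refs


# Made-up wikitext for illustration, not taken from a real article.
sample = (
    'Coffee is a drink.<ref name="a">{{cite web |title=Example}}</ref> '
    'It is popular.<ref name="a" /> The end.<ref>Plain note</ref>'
)
for ref in extract_refs(sample):
    print(ref)
```

The body of each entry is still raw wikitext (often a cite template), so the metadata would need a second parsing step.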
So one strategy would be to fetch the full content of the page and do my own parsing to find all <ref></ref> pairs. However, I suspect there is a way to do this within the MediaWiki API that I'm not finding. Is there one? I'd rather pull all of this from the wikitext, or something other than the final HTML, as I expect the former to be more stable.
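For reference, this is roughly how I am pulling the wikitext at the moment: a sketch that uses the API's action=parse with prop=wikitext. The helper names and the User-Agent string are placeholders of my own, and I am assuming formatversion=2 so that the wikitext comes back as a plain string.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_wikitext_url(title):
    """Build an action=parse request URL asking only for the page's wikitext."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "wikitext",
        "format": "json",
        "formatversion": "2",  # assumption: wikitext is returned as a plain string
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_wikitext(title):
    """Fetch the raw wikitext of a page (performs a network request)."""
    request = urllib.request.Request(
        build_wikitext_url(title),
        # Placeholder User-Agent; replace it with something identifying your tool.
        headers={"User-Agent": "ref-extractor-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["parse"]["wikitext"]


# Example call (needs network access):
# wikitext = fetch_wikitext("Coffee")
```

This only returns the raw text, so the <ref> tags still have to be located by my own parsing afterwards.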