I am making this request:
http://en.wikipedia.org/w/api.php?format=xml&action=query&titles=self-administration&prop=revisions&rvprop=content&rvparse=&rvsection=0
My goal is to get the plain-text from the intro of an article.
It gives me back some HTML inside an XML file. After running strip_tags and preg_replace to remove the references, I get this:
Self-administration is, in its medical sense, the process of a subject administering a pharmacological substance to him-, her-, or itself. [...] Cite error: There are tags on this page, but the references will not show without a {{Reflist}} template or a tag; see the help page.
I want to remove this part:
Cite error: There are tags on this page, but the references will not show without a {{Reflist}} template or a tag; see the help page.
How can I get rid of that, either with PHP (preg_replace?) or in my initial query (by ignoring errors somehow)?
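For reference, here is a minimal sketch of the preg_replace approach I have in mind. It assumes the notice always starts with "Cite error:" and runs to the end of the extracted text, which I have not verified for other articles. The function name removeCiteError and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper (not part of PHP or MediaWiki): strips a trailing
// "Cite error: ..." notice that is left in the text after strip_tags().
function removeCiteError(string $text): string
{
    // Assumption: the notice begins with "Cite error:" and extends to the
    // end of the text. The "s" flag lets "." match newlines as well.
    $cleaned = preg_replace('/\s*Cite error:.*$/s', '', $text);

    // preg_replace() returns null on failure; keep the input in that case.
    return $cleaned === null ? $text : trim($cleaned);
}

// Made-up sample input, shaped like the output shown above.
$sample = 'Intro text of the article. Cite error: some notice; see the help page.';
echo removeCiteError($sample), PHP_EOL;
```

This would also cut off any legitimate text that follows the notice, so it only makes sense if the notice is always the last thing in the output.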