2

I've been trying to understand the MediaWiki documentation for the past 2 days and I can't figure out how to retrieve the first paragraph of a Wikipedia article through the MediaWiki API.

Could someone point me in the right direction?

I am about to resort to file_get_contents, but I'm confident there's a "cleaner" solution.

svick
Russ Ted

2 Answers

2

file_get_contents is pretty clean: it gives you the page's HTML. You can then parse that HTML with DOMDocument, which works much like the DOM API in JavaScript. For example, you can fetch all the <p> elements inside a div, or grab just the first one.

For example:

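// 'the url' is a placeholder for the address of the page to fetch.
$html = file_get_contents('the url');

$dom = new DOMDocument();
// Real-world HTML is rarely valid; @ silences the resulting parser warnings.
@$dom->loadHTML($html);

// Text of the first <p> element in the document.
$p = $dom->getElementsByTagName('p')->item(0)->nodeValue;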
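
The "all <p>'s in a div" variant mentioned above could look roughly like the sketch below. It is only an illustration: the function name firstParagraph, the container id 'content', and the sample markup are all made up for this example, and the real page may use a different container. It also skips empty paragraphs, on the assumption that the first <p> in the markup is not always the one you want.

```php
<?php
// Hypothetical helper (not from the original answer): returns the text of the
// first non-empty <p> inside the <div> with the given id, or null if none.
function firstParagraph(string $html, string $containerId): ?string
{
    if ($html === '') {
        return null; // nothing to parse
    }

    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // $containerId is interpolated as-is, so it must not contain quotes.
    foreach ($xpath->query("//div[@id='$containerId']//p") as $p) {
        $text = trim($p->textContent);
        if ($text !== '') {
            return $text;
        }
    }
    return null;
}

// Made-up sample markup standing in for a downloaded page.
$sample = '<html><body><p>Site notice</p><div id="content"><p> </p>'
        . '<p>PHP is a scripting language.</p><p>Second paragraph.</p></div></body></html>';

echo firstParagraph($sample, 'content'), "\n"; // prints "PHP is a scripting language."
```

Restricting the search to one container keeps notices or navigation text outside it from being picked up as the "first paragraph".
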
Andrei
1

Don't try to use the raw API; use a client wrapper instead. Here's a long list to choose from, all for PHP:

http://en.wikipedia.org/wiki/Wikipedia:PHP_bot_framework_table

lambshaanxy