InstaPaper API - /api/1/bookmarks/get_text

Question

I'm working with the Instapaper API.

I'm using this call to pull the content of the article:

$Bookmark_Text = $connection->getBookmarkText($Bookmark['bookmark_id']);

Unfortunately, it returns a complete HTML document, so I end up with a second HTML structure nested inside my own page.

Example:

<html>
<head></head>
<body>
    <html>
    <head>Instapaper Title</head>
    <body>InstaPaper Article Content</body>
    </html>
</body>
</html>

Any thoughts on how to get just the "InstaPaper Article Content" part?

Thanks!

What language are you calling the API with? PHP? – freejosh May 19 '12 at 00:21
Yes, PHP. Will add to the tags. – Chris Olson May 19 '12 at 00:22

score 1 · Answer 1 · answered Aug 27 '12 at 15:45

Here’s some JS code that extracts only the article and removes Instapaper’s stuff (top and bottom bar for example).

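// html holds the string returned by get_text; replace() returns a new string, so keep the result.
var article = html.replace(/^[\s\S]*<div id="story">|<\/div>[^<]*<div class="bar bottom">[\s\S]*$/gim, '');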

Be aware that it may change as Instapaper’s HTML output changes.
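
As a rough illustration of what that regex does, here is a minimal sketch wrapped in a function and run against a made-up page. The helper name extractStory and the sample markup are assumptions for demonstration only. The two markers (div id="story" and div class="bar bottom") are taken from the regex above and are not confirmed against Instapaper's current output.

```javascript
// Hypothetical helper: removes everything up to and including <div id="story">,
// and everything from the </div> that precedes the bottom bar to the end.
function extractStory(html) {
  return html.replace(
    /^[\s\S]*<div id="story">|<\/div>[^<]*<div class="bar bottom">[\s\S]*$/gim,
    ''
  );
}

// Made-up sample, only shaped like the markup the regex expects.
const sample = [
  '<html><head><title>Instapaper Title</title></head><body>',
  '<div class="bar top">Top bar</div>',
  '<div id="story"><p>Article content</p></div>',
  '<div class="bar bottom">Bottom bar</div>',
  '</body></html>'
].join('\n');

console.log(extractStory(sample)); // <p>Article content</p>
```

The first alternative consumes the document head and top bar, and the second consumes the closing div plus the bottom bar, so only the inner markup of the story element is left.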

score 0 · Answer 2 · freejosh · 2012-05-19T02:40:12.167

Use a parser to extract the contents of <body>. PHP has some built in, but there are others out there which might be easier to use.

This should do it if $Bookmark_Text is a valid HTML document.

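$dom = new DOMDocument();
// Parse the string returned by the API into a DOM tree.
$dom->loadHTML($Bookmark_Text);
$body = $dom->getElementsByTagName('body')->item(0);
// Passing a node to saveHTML() needs PHP 5.3.6+; the result includes the <body> tags themselves.
$content = $body->ownerDocument->saveHTML($body);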

edited May 19 '12 at 02:40

answered May 19 '12 at 00:27

freejosh


None of these seem to be able to pull just everything in the body. – Chris Olson May 19 '12 at 00:54
Are you sure the HTML in your example is exactly what's returned by the API? I was able to create an example using `DOMDocument`, but because the `<head>` has text in it, that's parsed as a `<p>` and put into the body. – freejosh May 19 '12 at 02:33
Added my code to the answer. If the returned document isn't valid HTML, maybe your only choice is trying a regular expression. – freejosh May 19 '12 at 02:41
