
How do I skip the item in an rss feed if the description text does not exist?

I tried the line below, but I still get an error:

if($x -> item($i) -> getElementsByTagName('description')) { then proceed }

when an item in the RSS feed looks like this:

<item>
    <title>Berlusconi on the brink</title>
    <link>http://www.spectator.co.uk/coffeehouse/7375408/berlusconi-on-the-brink.thtml</link>
    <guid isPermaLink="false"> http://www.spectator.co.uk/coffeehouse/7375408/berlusconi-on-the-brink.thtml </guid>
    <pubDate>Tue, 08 Nov 2011 16:42:03 +0000</pubDate>
</item>

As you can see, <description> is not provided.

Below is the function I came up with...

function feed_reader($url,$limit = 1,$title =  null)
{
    $xml = ($url); //http://twitter.com/statuses/friends_timeline/321998072.rss
    $xmlDoc = new DOMDocument();
    $xmlDoc -> load($xml);

    # Get and output "<item>" elements.
    $x = $xmlDoc -> getElementsByTagName('item');

    # Count the total feed with xpath.
    $xpath = new DOMXPath($xmlDoc);
    $total_feed = $xpath->evaluate('count(//item)');

    # Set feed limit.
    $limit_feed = $limit;

    # Check if the total feed is less than the limit then use the total feed as the limit.
    if($limit_feed >= $total_feed) $limit_feed = $total_feed;

    # Set the variable.
    $output = null;

    for ($i=0; $i<$limit_feed; $i++)
    {
        # Make sure that the description node exists, then process.
        if($x -> item($i) -> getElementsByTagName('description'))
        {
            $item_title = $x -> item($i) -> getElementsByTagName('title') -> item(0) -> childNodes -> item(0) -> nodeValue;
            $item_link = $x -> item($i) -> getElementsByTagName('link') -> item(0) -> childNodes -> item(0) -> nodeValue;
            $item_date = $x -> item($i) -> getElementsByTagName('pubDate') -> item(0) -> childNodes -> item(0) -> nodeValue;
            $item_description = $x -> item($i) -> getElementsByTagName('description') -> item(0) -> childNodes -> item(0) -> nodeValue;

            # NOTE: use this code for the server runs PHP5.3
            # DateTime::add — Adds an amount of days, months, years, hours, minutes and seconds to a DateTime object
            $date = new DateTime($item_date);

            # change the date format into Y-m-d H:i:s
            $item_date = $date -> format('j F Y');

            # count time ago from the published date
            $time_ago = time_ago($date -> format('Y-m-d H:i:s'),'d M Y \a\t H:i');

            if($title) $output .= '<li><p>'.preg_replace('/^('.$title.':)/', ' ',limit_length(strip_tags($item_description), 200)).'<br/>'.$time_ago.'</p></li>';
                else $output .= '<li><p>'.limit_length(strip_tags($item_description), 200).'<br/>'.$time_ago.'</p></li>';

        }
    }

    return $output;

}
Run

2 Answers


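getElementsByTagName() always returns a DOMNodeList, even when nothing matches, and an object is always truthy in PHP, so your if never fails. You need to check whether the list is empty (its length property is 0). Alternatively, you can use XPath, as in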

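$item = $x->item($i); // a single DOMNode, not the whole DOMNodeList
$xpath = new DOMXPath($item->ownerDocument);
// count(description) counts child elements; a quoted name would be a string literal, which count() rejects
if ($xpath->evaluate('count(description)', $item)) { ...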
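
For the first option, checking whether the list is empty, here is a minimal sketch. The sample feed and the `$titles` array are made up for illustration and are not part of the original function; the point is only the `length` test on the returned DOMNodeList.

```php
<?php
// Sample feed (made up for this sketch): the second item has no <description>.
$rss = <<<XML
<rss version="2.0"><channel>
  <item><title>First</title><description>Has text</description></item>
  <item><title>Second</title></item>
</channel></rss>
XML;

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($rss);

$titles = array();
foreach ($xmlDoc->getElementsByTagName('item') as $item) {
    $descriptions = $item->getElementsByTagName('description');

    // The DOMNodeList object itself is always truthy; its length is what matters.
    if ($descriptions->length === 0) {
        continue; // skip items without a <description>
    }

    $titles[] = $item->getElementsByTagName('title')->item(0)->nodeValue;
}

print_r($titles);
```

Only the first item's title is collected here, because the second item is skipped before any of its child nodes are read.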
Explosion Pills
  • thanks. I get this error `Notice: Undefined property: DOMNodeList::$ownerDocument in C:\wamp\www\global_tolerance_2011_MVC\local\views\includes\templates\mosaic-pulse.php on line 43 Call Stack` what is `ownerDocument`?? Thanks. – Run Nov 08 '11 at 17:53
  • Sorry, `$x->item($i)->ownerDocument;` – Explosion Pills Nov 08 '11 at 21:16
  • its ok. but now i get a number of error like `Fatal error: Maximum execution time of 30 seconds exceeded in C:\wamp\www\global_tolerance_2011_MVC\local\views\includes\templates\mosaic-pulse.php on line 9 Call Stack` or `Warning: DOMDocument::load() [domdocument.load]: Premature end of data in tag source line 133 in http://twitter.com/statuses/user_timeline/globaltolerance.rss, line: 133 in C:\wamp\www\global_tolerance_2011_MVC\local\views\includes\templates\mosaic-pulse.php on line 9 Call Stack` have u tested the code? thanks? – Run Nov 08 '11 at 22:05
  • @lauthiamkok that's a completely separate issue; just seems like the RSS is not being parsed properly. – Explosion Pills Nov 08 '11 at 22:06
  • I get this error this time `Catchable fatal error: Argument 2 passed to DOMXPath::evaluate() must be an instance of DOMNode, instance of DOMNodeList given, called in C:\wamp\www\test\2011\php\feed_reader\reader_4.php on line 67 and defined in C:\wamp\www\test\2011\php\feed_reader\reader_4.php on line 37` – Run Nov 08 '11 at 22:11
  • Second argument also needs to be `$x->item($i);` – Explosion Pills Nov 08 '11 at 23:05
  • got it. changed it and it gets better but with this looping error - `Warning: DOMXPath::evaluate() [domxpath.evaluate]: Invalid type in C:\wamp\www\test\2011\php\feed_reader\reader_4.php on line 37 Call Stack` what do i still miss? thanks. – Run Nov 08 '11 at 23:14
  • `$x->item($i)` is probably null. You should check that first. – Explosion Pills Nov 08 '11 at 23:16

Just use a single XPath query that returns only the items that have a description (since those are the only ones you want):

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($xml);
$xpath = new DOMXPath($xmlDoc);
$items = $xpath->query('/rss/channel/item/description/..'); # in RSS 2.0 the <item> elements sit inside <channel>

# Set the variable.
$output = '';

foreach($items as $number => $item)
{
    if ($number >= $limit_feed) break; # enough items, limit reached ($number starts at 0)

    # process each item as you see fit:

    echo $xmlDoc->saveXML($item), "\n"; # debug only for demo

    $item_title = $item->getElementsByTagName('title')->item(0)->childNodes->item(0)->nodeValue;

    # and so on ....

    if ($title) 
        $output .= '<li><p>'.preg_replace('/^('.$title.':)/', ' ',limit_length(strip_tags($item_description), 200)).'<br/>'.$time_ago.'</p></li>';
    else
        $output .= '<li><p>'.limit_length(strip_tags($item_description), 200).'<br/>'.$time_ago.'</p></li>';

}

return $output;
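
A self-contained variant of the same idea, as a sketch only. It uses an XPath predicate, `item[description]`, which selects the same nodes as `item/description/..`. The helper name `items_with_description()` and the sample feed are hypothetical, and the `/rss/channel/item` path assumes a standard RSS 2.0 layout.

```php
<?php
// Hypothetical helper for this sketch; not part of the original function.
function items_with_description($xml, $limit)
{
    $xmlDoc = new DOMDocument();
    $xmlDoc->loadXML($xml);
    $xpath = new DOMXPath($xmlDoc);

    $result = array();
    // The predicate [description] keeps only <item> elements with a <description> child.
    foreach ($xpath->query('/rss/channel/item[description]') as $item) {
        if (count($result) >= $limit) break; // limit reached

        $result[] = array(
            'title'       => $xpath->evaluate('string(title)', $item),
            'description' => $xpath->evaluate('string(description)', $item),
        );
    }
    return $result;
}

// Sample feed (made up): item B has no <description> and is never selected.
$feed = '<rss version="2.0"><channel>'
      . '<item><title>A</title><description>one</description></item>'
      . '<item><title>B</title></item>'
      . '<item><title>C</title><description>three</description></item>'
      . '<item><title>D</title><description>four</description></item>'
      . '</channel></rss>';

$items = items_with_description($feed, 2);
print_r($items);
```

Because the filtering happens in the query, the loop body never has to test for a missing node, and the limit counts only items that are actually used.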

Demo

hakre
  • thanks hakre. sorry I still don't know how to fit your code into mine in the function... – Run Nov 08 '11 at 18:24
  • This does all of your function *but* concatenating the `$output`. The code example in the answer already checks for your `$limit_feed` parameter, so to not process more than X elements (if there are more). – hakre Nov 08 '11 at 18:29
  • I use `return` but not `echo` in my function. – Run Nov 08 '11 at 18:32
  • also I grab the date, title, and link from each item. – Run Nov 08 '11 at 18:33
  • @lauthiamkok: I edited the answer so it's more clear. This will reduce the amount of code you have in your function, but inside the loop / foreach it stays pretty much the same (you use `$x` instead of `$item`, see the update). – hakre Nov 08 '11 at 18:35
  • thanks so much for the edit. sorry I still have the problem with it though... please have a look here - http://codepad.org/fg9HT4jQ – Run Nov 08 '11 at 18:55
  • I can't use your RSS feed because it's protected, so I can not test this easily. What is the issue? – hakre Nov 08 '11 at 19:00
  • I even tried with a simple xml code but getting an error, http://codepad.org/luR29SuG – Run Nov 08 '11 at 19:02
  • Use `->load($url)` with a URL and `->loadXML($xml)` for XML as text. The error you get is either a) because the $url is not accessible or b) because you are trying to load XML text with the `->load()` function. – hakre Nov 08 '11 at 19:05
  • Then I know it is not working - I need to load an URL but not XML - http://codepad.org/0kOZJXms this one works because I load XML but it won't work if I load an url... the problem is I need to load an url actually. – Run Nov 08 '11 at 19:22
  • it is now working when I changed this line to - `$items = $xpath->query('/rss/channel/item/description/..');` – Run Nov 08 '11 at 19:30
  • Yes, that's the xpath query. I simplified it for the example, sorry that I didn't put a notice on it. And cool to read you got it to work. This xpath is really cool to fetch the elements you're interested in. – hakre Nov 08 '11 at 19:43
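
The difference between `load()` and `loadXML()` discussed in the comments can be sketched as follows. A temporary file stands in for the feed URL so the example runs without network access; the XML string and the variable names are made up for illustration.

```php
<?php
$xml = '<rss version="2.0"><channel><item><title>Demo</title></item></channel></rss>';

// loadXML() parses a string that already contains the XML.
$fromString = new DOMDocument();
$fromString->loadXML($xml);

// load() takes a filename or URL; a temporary file stands in for the feed URL here.
$path = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($path, $xml);
$fromFile = new DOMDocument();
$fromFile->load($path);
unlink($path);

$countString = $fromString->getElementsByTagName('item')->length;
$countFile   = $fromFile->getElementsByTagName('item')->length;

echo $countString, ' ', $countFile, "\n";
```

Both documents end up with the same content; only the source of the XML differs.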