
I'm using this code

<?php 
    $feedUrl = 'http://www.infoextractor.org/upfiles/songlinks.txt.xml';
    $rawFeed = file_get_contents($feedUrl);
    $viewfile = new SimpleXmlElement($rawFeed);

    foreach ($viewfile->channel->item as $viewfileinfo):
        $title=$viewfileinfo->title;

        echo " <span>",$title,"</span> ";

    endforeach;
?>

to get me all the titles from the following link - http://www.infoextractor.org/upfiles/songlinks.txt.xml

It was working last week but now it is coming up with "Exception thrown String could not be parsed as XML".

Last week there were 8 fewer entries and it was working. I've deleted them to see whether that was the issue, but it still throws the exception. Maybe something in the XML file is now being generated differently?

William Perron
user3386034
  • I must admit, with the information provided, I'm unable to reproduce the problem you describe. The error disappeared. Please add the XML in question. – hakre Jun 30 '14 at 19:39

2 Answers

Your code is missing proper error handling:

$rawFeed = file_get_contents($feedUrl);

There are different kinds of errors here that you're not dealing with:

  • file_get_contents fails and returns FALSE. You have to check for that, e.g.

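    if ($rawFeed === FALSE) {
        // The second constructor argument is an integer error code,
        // so the message has to be formatted with sprintf() first.
        throw new RuntimeException(
            sprintf('Deal with it: Unable to retrieve %s', $feedUrl)
        );
    }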
    
  • file_get_contents returns content that is not (valid) XML. That is, you need to catch the exception thrown when creating the SimpleXmlElement:

    try {
        $viewfile = new SimpleXmlElement($rawFeed);
    } catch (Exception $e) {
        throw new RuntimeException('Deal with it: ' . $e->getMessage(), 0, $e);
    }
    

In any case, you need to do the error handling yourself. You cannot expect everything to work magically all the time. Actually, the opposite is the case: design for failure, especially when you deal with remote resources.
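
Putting the two checks together, a minimal sketch could look like the following. The function names parseFeed and loadFeed are made up for illustration, and it uses simplexml_load_string with libxml_use_internal_errors (instead of the constructor) so that the parser's line and column details end up in the exception message. Treat it as one possible arrangement, not the only way to do it.

```php
<?php
// Hypothetical helper (name is an assumption): parse a raw string as XML,
// or throw an exception that carries the libxml diagnostics.
function parseFeed(string $rawFeed): SimpleXMLElement
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $xml = simplexml_load_string($rawFeed);
    $errors = libxml_get_errors();

    // Restore the previous libxml error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        $details = array_map(
            function (LibXMLError $e) {
                return sprintf('line %d, column %d: %s', $e->line, $e->column, trim($e->message));
            },
            $errors
        );
        throw new RuntimeException(
            "Feed is not well-formed XML:\n" . implode("\n", $details)
        );
    }

    return $xml;
}

// Hypothetical helper (name is an assumption): fetch the feed, then parse it.
function loadFeed(string $feedUrl): SimpleXMLElement
{
    // @ suppresses the warning; the failure is reported via the exception below.
    $rawFeed = @file_get_contents($feedUrl);
    if ($rawFeed === false) {
        throw new RuntimeException(sprintf('Unable to retrieve %s', $feedUrl));
    }

    return parseFeed($rawFeed);
}
```

Calling loadFeed($feedUrl) inside a try/catch block then lets the page show a fallback message, and the logged exception text points at the line of the feed that broke the parser.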

hakre

The exception tells you exactly what is wrong: the feed URL sometimes returns malformed XML. This could be due to the user-generated content in the feeds sometimes having an illegal XML character, such as & or <. The owner of that XML feed should escape all user-generated content.
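
To illustrate what escaping means on the producing side, here is a small sketch. It assumes the feed is assembled by string concatenation in PHP, which may not be how that site actually builds it, and buildItem is a made-up name. The point is only that htmlspecialchars with ENT_XML1 turns & and < into entities, so the result parses.

```php
<?php
// Hypothetical feed-side helper (name is an assumption): wrap a
// user-supplied title in an <item>, escaping XML special characters.
function buildItem(string $title): string
{
    // ENT_XML1 selects XML 1.0 escaping rules; ENT_QUOTES also escapes quotes.
    $escaped = htmlspecialchars($title, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return '<item><title>' . $escaped . '</title></item>';
}

$title = 'Tom & Jerry <live>';

// Unescaped, the same title would make SimpleXMLElement throw.
$item = new SimpleXMLElement(buildItem($title));

// Reading the node back yields the original, unescaped text.
echo $item->title, "\n";
```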

You can try to parse the content and 'fix' it, but that will be a pain without SimpleXml. A simpler solution would be to wait some time until the user-generated content with the offending character cycles out of the feed.

Note that, unlike HTML parsers, standards-compliant XML parsers are required to throw an exception and stop parsing on invalid input.

dotancohen