I'm using Symfony 2.8 and DomCrawler to parse a web site, and I'm having a problem reading data
attributes from an HTML element. It might be as simple as a specific convention for data
attributes, but I haven't been able to find any references or examples on the web that discuss how to retrieve data attributes via DomCrawler.
Here are the details:
I have encountered an instance of this construct in the HTML I am parsing (from another web site, so I can't modify this HTML):
<div class='slideshowclass' id='slideshow'>
<div data-thumb='http://www.example.com/thumbs/1.jpg'
data-src='http://www.example.com/thumbs/1.jpg'></div>
<div data-thumb='http://www.example.com/thumbs/2.jpg'
data-src='http://www.example.com/thumbs/2.jpg'></div>
<div data-thumb='http://www.example.com/thumbs/3.jpg'
data-src='http://www.example.com/thumbs/3.jpg'></div>
<div data-thumb='http://www.example.com/thumbs/4.jpg'
data-src='http://www.example.com/thumbs/4.jpg'></div>
<div data-thumb='http://www.example.com/thumbs/5.jpg'
data-src='http://www.example.com/thumbs/5.jpg'></div>
<div data-thumb='http://www.example.com/thumbs/6.jpg'
data-src='http://www.example.com/6.jpg'></div>
</div>
I'm using this code to search the block of divs and return the data-src values:
function getList( Crawler $pWebDoc ) {
    $list = $pWebDoc->filter( 'div#slideshow');
    if ( !$list )
        return null;
    $retlist = null;
    $x = $list->count();
    if ( $x > 0 ) {
        /* @var $item Crawler */
        $retlist = $list->children()->each( function (Crawler $item, $i ) {
            return ( "$i:" . $item->attr( 'data-src' ));
        });
    }
    return ( $retlist );
}
From the DomCrawler docs I expect the attr function to return the data-src attribute value, but it returns null; my function returns an array of 6 elements containing just the number and no additional text.
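To help narrow this down, here is a minimal cross-check that can be run outside Symfony. It is only a sketch under stated assumptions: `listDataSrc` is a hypothetical helper name, the sample markup is a shortened copy of the HTML above, and it uses PHP's built-in DOMDocument and DOMXPath classes rather than DomCrawler. The idea is to feed it the HTML that is actually downloaded and see whether the data-src attributes are present in that markup at all.

```php
<?php
// Hypothetical helper (not part of DomCrawler): reads data-src from the
// child divs of #slideshow using only PHP's built-in DOM extension.
function listDataSrc($html)
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting PHP warnings
    // for imperfect real-world markup, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = array();
    $i = 0;
    foreach ($xpath->query("//div[@id='slideshow']/div") as $div) {
        // getAttribute() returns an empty string when the attribute is absent.
        $result[] = $i . ':' . $div->getAttribute('data-src');
        $i++;
    }
    return $result;
}

// Shortened sample of the markup from the question (assumption: the real
// page has the same structure).
$sample = "<div class='slideshowclass' id='slideshow'>"
    . "<div data-thumb='http://www.example.com/thumbs/1.jpg'"
    . " data-src='http://www.example.com/thumbs/1.jpg'></div>"
    . "<div data-thumb='http://www.example.com/thumbs/2.jpg'"
    . " data-src='http://www.example.com/thumbs/2.jpg'></div>"
    . "</div>";

print_r(listDataSrc($sample));
```

If this prints the URLs for the sample but only bare numbers for the downloaded page, the HTML received by the script probably differs from what the browser shows; if it prints the URLs in both cases, the problem is more likely in how the Crawler is being built or queried.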
Thanks in advance for your help.