I'm trying to access an attribute of a previous sibling, but it's proving difficult.
So basically the web page I'm trying to scrape is TERRIBLE and the anchor tags use crappy onclick handlers instead of href. Stupid, I know. I'm trying to first find the anchor tag whose onclick contains window.open('servletLinkJunkHere...'), then move to the previous sibling, which is an img tag, and extract the src attribute from it.
<IMG SRC="images/warning.gif" ALT="blah blah blah" STYLE="position:relative;top:2px;cursor:help;">
<a href="#" onclick="javascript:window.open('servletLinkJunkHere...')">
And here's the XPath I'm trying to use:
$url_pre = 'a[onclick*="'servletLinkJunkHere...'"]/preceding-sibling::img/@src';
Any ideas on how I can accomplish this? I know it's possible; I'm just not very proficient with XPath queries. Also, are there any good resources for learning all the nooks and crannies of XPath? Thanks!
EDIT: So this is what I have but it doesn't seem to be returning anything but an empty array.
$url_email = "EditNotificationInfoServlet?cb=on&id=" . $id . "&sessionId=1";
$url_pre = "a[contains(@onclick,'" . $url_email . "')]/preceding-sibling::IMG/@SRC";
$final_text = $crawler->filterXPath($url_pre)->each(function($crawler, $i) {
    return $crawler->text();
});
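For comparison, here is a minimal sketch of the same lookup using PHP's built-in DOMDocument and DOMXPath instead of the crawler. It rests on a few assumptions that may not hold for the real page: that the HTML parser lowercases element and attribute names (so the query uses img and @src rather than IMG and @SRC), that the img and the a share the same parent, and that the query needs a leading // so it searches the whole document rather than only children of the context node. The sample markup, the id value 42, and the variable names $needle, $query and $srcs are made up for illustration.

```php
<?php
// Hypothetical sample markup modeled on the snippet in the question.
$html = <<<'HTML'
<div>
<IMG SRC="images/warning.gif" ALT="blah blah blah" STYLE="position:relative;top:2px;cursor:help;">
<a href="#" onclick="javascript:window.open('EditNotificationInfoServlet?cb=on&amp;id=42&amp;sessionId=1')">edit</a>
</div>
HTML;

$id = 42; // made-up id, for illustration only
$needle = 'EditNotificationInfoServlet?cb=on&id=' . $id . '&sessionId=1';

// Collect parser warnings internally instead of printing them,
// since real-world markup is often sloppy.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// "//" searches the whole document; names are lowercase because the
// HTML parser normalizes them; [1] picks the nearest preceding img sibling.
$query = "//a[contains(@onclick, '" . $needle . "')]/preceding-sibling::img[1]/@src";

$srcs = [];
foreach ($xpath->query($query) as $attr) {
    // Each result is an attribute node; nodeValue holds the src string.
    $srcs[] = $attr->nodeValue;
}

print_r($srcs);
```

With the sample markup above, $srcs should end up holding the single value images/warning.gif. If something similar works against the real page, the empty array from the crawler version may come down to the uppercase names or the missing // prefix, though that is a guess without seeing the full document.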