This is the HTML page:
<div class="gs_ri">
<h3 class="gs_rt">
<span class="gs_ctc">
<span class="gs_ct1">[BOOK]</span>
<span class="gs_ct2">[B]</span></span>
<a href="http://example.com" onmousedown="">Title</a></h3>
<div class="gs_a">A</div>
<div class="gs_rs">B</div>
<div class="gs_fl"><a href="">C</a> <a href="">D</a> <a href=""></a></div></div></div>
<div class="gs_r"><div class="gs_ggs gs_fl"><button type="button" id="gs_ggsB2" class="gs_btnFI gs_in_ib gs_btn_half">
<span class="gs_wr"><span class="gs_bg"></span>
<span class="gs_lbl"></span>
<span class="gs_ico"></span></span></button>
<div class="gs_md_wp" id="gs_ggsW2"><a href="http://example.pdf" onmousedown=""
I'm a little confused about how to identify the right node.
I want to get the URL http://example.com
and the link text Title.
I thought of two ways to get them.
First, treating the link as a sibling of the <span>:
foreach($html->find('span[class=gs_ctc2] ') as $link){
    $link = $link->next_sibling();
    echo $link->plaintext;
    echo $link->href;
}
But it does not work.
Second, I take <h3 class="gs_rt">
as the parent, so the link would be the sibling of its last child:
foreach($html->find('h3[class=gs_rt] a') as $link){
    $link = $link->last_child()->next_sibling();
    echo $link->plaintext;
    echo $link->href;
}
This does not work either. I think I do not yet understand the DOM node tree.
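To make the goal concrete, here is a minimal sketch of the output I am after. It uses PHP's built-in DOMDocument and DOMXPath rather than the Simple HTML DOM calls above, so it is a swapped-in technique and not a fix of my loops. The trimmed $markup string and the $results array are made up for the illustration. It selects the <a> directly as a child of <h3 class="gs_rt">, since in the snippet the <a> sits inside the h3 right after <span class="gs_ctc">.

```php
<?php
// Trimmed copy of the markup above (assumption: only the h3 block matters here).
$markup = <<<HTML
<div class="gs_ri">
<h3 class="gs_rt">
<span class="gs_ctc">
<span class="gs_ct1">[BOOK]</span>
<span class="gs_ct2">[B]</span></span>
<a href="http://example.com" onmousedown="">Title</a></h3>
<div class="gs_a">A</div>
</div>
HTML;

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The <a> is a direct child of <h3 class="gs_rt">, a sibling of <span class="gs_ctc">.
$results = [];
foreach ($xpath->query('//h3[@class="gs_rt"]/a') as $link) {
    $results[] = [
        'title' => trim($link->textContent),   // link text
        'href'  => $link->getAttribute('href') // link target
    ];
}

foreach ($results as $r) {
    echo $r['title'], "\n", $r['href'], "\n";
}
```

With the trimmed markup this finds one link, with the text Title and the href http://example.com. What I still cannot work out is how to express the same selection with find(), next_sibling() and last_child().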