I am trying to grab the elements between two tags with specific CSS classes in my HTML document.

Here is an example.

<p class="heading10">text text</p>
<p>text text text</p>
<p>text text text</p>
<p class="heading11">text text</p>
<p></p>
<p></p>

I don't know how to grab the <p> elements between the <p> with class heading10 and the one with class heading11.

I tried //p[@class="heading10"]//following-sibling::p, but it grabs all <p> elements after the one with class heading10, not just those before heading11.

GrandMasterFlush
slphp
  • I doubt this is possible with an XPath query... but don't quote me there. However, you could grab the parent node and loop through its children, storing each element in an array after the first instance of the class heading10 and stopping when you find the class heading11. Then you can use your new array to parse the HTML however you need. – Mikel Bitson Nov 18 '15 at 21:37
  • Mikel, do you have any simple code? I am a little lost. Thanks. – slphp Nov 18 '15 at 22:18
  • This is what I have so far: $pTag1 = $tPageXpath->query('//*[@class="Heading10"]/following-sibling::p'); I get a total of 5 elements. I'm not sure how to stop the loop when it hits the heading11 element. – slphp Nov 18 '15 at 22:23
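
The loop-and-stop idea from the comments above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name collectBetween is hypothetical, the sample markup is wrapped in a <root> element so it parses as XML, and each class is assumed to occur only once in the document.

```php
<?php
// Sample markup from the question, wrapped in <root> so it is well-formed XML.
$xml = '<?xml version="1.0"?>
<root><p class="heading10">text text</p>
<p>text text text</p>
<p>text text text</p>
<p class="heading11">text text</p>
<p></p>
<p></p></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Hypothetical helper: walk the <p> siblings that follow the element with
// $startClass and stop as soon as the element with $endClass is reached.
function collectBetween(DOMXPath $xpath, $startClass, $endClass)
{
    $between = [];
    $query = '//p[@class="' . $startClass . '"]/following-sibling::p';
    foreach ($xpath->query($query) as $node) {
        if ($node->getAttribute('class') === $endClass) {
            break; // reached the closing marker, so stop collecting
        }
        $between[] = $node;
    }
    return $between;
}

$between = collectBetween($xpath, 'heading10', 'heading11');
foreach ($between as $node) {
    echo $node->textContent, "\n";
}
```

The comparison on the class attribute is exact, so an element carrying several classes (for example class="heading11 big") would not be recognised as the stop marker in this sketch.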

1 Answer

Try something like

//p[@class="heading10"]/following-sibling::p[position()<count(//p[@class="heading11"]/preceding-sibling::p)]

EDIT:

A bit more explanation for @jpaugh:

The OP's XPath grabs all sibling p elements after the one with class="heading10". I have added a restriction: the position() of each of these elements must be less than the number of p siblings that precede the p element with class="heading11".

The following code is confirmed to work with PHP 5.5 and does not work with PHP 5.4 (thanks @slphp):

$t = '<?xml version="1.0"?>
<root><p class="heading10">text text</p>
<p>text text text</p>
<p>text text text</p>
<p class="heading11">text text</p>
<p></p>
<p></p></root>';

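// Create an instance first: calling loadXML() statically throws an Error on PHP 8.
$d = new DOMDocument();
$d->loadXML($t);
$x = new DOMXPath($d);
// Keep only the following siblings positioned before the heading11 element.
var_dump($x->query('//p[@class="heading10"]/following-sibling::p[position()<count(//p[@class="heading11"]/preceding-sibling::p)]'));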


class DOMNodeList#6 (1) {
  public $length =>
  int(2)
}

Please note that if <p class="heading10"> is not the first p element among its siblings, then you need to subtract the number of p elements that precede it:

//p[@class="heading10"]/following-sibling::p[position()<(count(//p[@class="heading11"]/preceding-sibling::p) - count(//p[@class="heading10"]/preceding-sibling::p))]

Splitting by lines for the sake of readability:

//p[@class="heading10"]
 /following-sibling::p[
     position()<(
         count(//p[@class="heading11"]/preceding-sibling::p) -
         count(//p[@class="heading10"]/preceding-sibling::p)
     )
  ]
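
As a rough check of the subtraction, the sketch below runs both expressions against a variant of the sample in which two extra <p> elements come before heading10. The markup, the "intro" text, and the variable names are made up for illustration; only the two XPath expressions come from the answer.

```php
<?php
// Variant of the sample: two extra <p> elements precede heading10.
$xml = '<?xml version="1.0"?>
<root><p>intro</p>
<p>intro</p>
<p class="heading10">text text</p>
<p>text text text</p>
<p>text text text</p>
<p class="heading11">text text</p>
<p></p></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Simple form: compares position() with the raw count of siblings before heading11.
$simple = '//p[@class="heading10"]/following-sibling::p['
        . 'position() < count(//p[@class="heading11"]/preceding-sibling::p)]';

// Adjusted form: subtracts the siblings that precede heading10 itself.
$adjusted = '//p[@class="heading10"]/following-sibling::p[position() < ('
          . 'count(//p[@class="heading11"]/preceding-sibling::p) - '
          . 'count(//p[@class="heading10"]/preceding-sibling::p))]';

$simpleNodes = $xpath->query($simple);
$adjustedNodes = $xpath->query($adjusted);

echo $simpleNodes->length, "\n";   // 4: also picks up heading11 and the p after it
echo $adjustedNodes->length, "\n"; // 2: only the two p elements in between
```

Both counts use // and so look at the whole document, which means this only behaves as intended when each class appears once and all the p elements share one parent.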
Alex Blex
  • This comment could use more explanation (such as how it differs from the OP's xpath). – jpaugh Nov 19 '15 at 00:16
  • jpaugh, what is the OP's xpath? I tried Alex's suggestion. When I run the script, it grabs the 5 <p> elements below the <p class="heading10">. Does this help? – slphp Nov 19 '15 at 00:19
  • @user2574693 "OP's xpath" is the one in the question: `//p[@class="heading10"]//following-sibling::p]`. Basically your one. Please note I have edited it in the question because of a typo. – Alex Blex Nov 19 '15 at 09:48
  • Hi Alex, thanks for the detailed code sample. I am testing it now on my PHP 5.4.16 and it doesn't work. I am going to update my PHP and try again. I'll keep you posted. – slphp Nov 20 '15 at 19:48
  • Alex's sample code works perfectly. I just tested it and it grabs what I am looking for. – slphp Nov 20 '15 at 20:16
  • @slphp, glad to know it helped. You can accept the answer to indicate it is correct. – Alex Blex Nov 21 '15 at 12:42