
In lxml, I'm using XPath to select all of the tr's in a table (which has a varying number of rows) except for the last two rows, which contain gibberish.

Is there a pattern that excludes the last two rows? I was looking through XPath tutorials, and apparently there is an "except" operator and also a "last()" function, but I can't seem to get my code working.

So far I have this. What do I add to this pattern to make it exclude the last two rows? The main problem is that the number of tr's varies.

result = doc.xpath("//tr")

I guess I could turn this into a list and just remove the last two elements, but is there an easier or more elegant solution?

Thanks in advance!

chesspro
  • Good question, +1. See my answer for a pure XPath solution (a single one-liner expression) that selects all the wanted `tr` elements. :) – Dimitre Novatchev Feb 11 '11 at 17:43

2 Answers


Use:

expressionSelectingTheTable/tr[not(position() > last() -2)]

where expressionSelectingTheTable should be replaced with a specific XPath expression that selects the table in question (such as //table[@id='foo']).

This single XPath expression selects all tr children of the table whose position is not one of the last two.
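
For reference, here is a minimal sketch of how this might look from lxml. The sample markup, the `foo` id and the variable names are made up for illustration. Note that the path assumes the tr elements are direct children of the table; if the markup wraps them in a tbody, the path would need a `/tbody` step.

```python
from lxml import html

# Hypothetical sample page: three data rows followed by two junk rows.
SAMPLE = """<html><body>
<table id="foo">
  <tr><td>row 1</td></tr>
  <tr><td>row 2</td></tr>
  <tr><td>row 3</td></tr>
  <tr><td>junk A</td></tr>
  <tr><td>junk B</td></tr>
</table>
</body></html>"""

doc = html.fromstring(SAMPLE)

# Keep every tr child of the table whose position is not among the last two.
rows = doc.xpath("//table[@id='foo']/tr[not(position() > last() - 2)]")

texts = [tr.text_content().strip() for tr in rows]
print(texts)  # ['row 1', 'row 2', 'row 3']
```

Because position() and last() are evaluated relative to each parent, the same predicate applied to a path matching several tables drops the last two rows of each table separately.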

Dimitre Novatchev
  • Hm interesting, didn't know you could do position() > last(). I thought you could only have one or the other. Thanks! – chesspro Feb 11 '11 at 19:45
  • @chesspro: It is actually `not(position() > last() -2)` , and yes, the `position()` and `last()` functions can take part in *any* XPath expression. Expressions like `not(position() = last())` are used very often. – Dimitre Novatchev Feb 11 '11 at 20:32
result = doc.xpath("//tr")[0:-2]

Should do the trick.
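
One thing to watch, shown in the rough sketch below (the two-table markup and the `cell_texts` helper are invented for the example): the slice is applied to the whole result list, so with `//tr` on a page that has more than one table it only drops the last two rows in the document. Scoping the XPath to a single table first avoids that.

```python
from lxml import html

# Hypothetical page with two tables, each ending in two junk rows.
SAMPLE = """<html><body>
<table id="a">
  <tr><td>a1</td></tr><tr><td>a2</td></tr>
  <tr><td>a-junk1</td></tr><tr><td>a-junk2</td></tr>
</table>
<table id="b">
  <tr><td>b1</td></tr>
  <tr><td>b-junk1</td></tr><tr><td>b-junk2</td></tr>
</table>
</body></html>"""

doc = html.fromstring(SAMPLE)


def cell_texts(rows):
    """Return the stripped text of each row (helper for this example only)."""
    return [tr.text_content().strip() for tr in rows]


# Slicing the document-wide result only removes the final two rows overall.
all_rows = cell_texts(doc.xpath("//tr")[:-2])
print(all_rows)  # ['a1', 'a2', 'a-junk1', 'a-junk2', 'b1']

# Selecting one table first makes the slice drop that table's last two rows.
scoped = cell_texts(doc.xpath("//table[@id='a']/tr")[:-2])
print(scoped)  # ['a1', 'a2']
```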

Philip Southam