
I am creating an XSL stylesheet and ran into this (in my opinion illogical) behavior:

This XPath:

/root/element[1][@attr1 != '1' or @attr2 != 'test']

is WAY slower than this XPath:

/root/element[(count(preceding-sibling::element) + 1 = 1) and (@attr1 != '1' or @attr2 != 'test')]

I have 50 sample XML files, and with the first XPath it takes ~55 sec.
With the second XPath it takes ~4 sec!

I use XslCompiledTransform (C#, .NET 4.5).

Can someone explain why the first XPath is THAT much slower than the second? I always thought it was better to use the explicit index filter.
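To rule out that the two expressions simply select different nodes, here is a minimal sketch that evaluates both against a tiny document. It is written in Java and uses the JDK's built-in XPath 1.0 engine, not the .NET processor in question, so it only checks that the results agree and says nothing about the timing. The class name, the `doc` helper, and the two-element test documents are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathEquivalence {
    // Positional predicate first, then the attribute test.
    static final String INDEXED =
        "/root/element[1][@attr1 != '1' or @attr2 != 'test']";

    // Position expressed through the number of preceding siblings.
    static final String COUNTED =
        "/root/element[(count(preceding-sibling::element) + 1 = 1)"
        + " and (@attr1 != '1' or @attr2 != 'test')]";

    // Returns the attr1 values of all elements selected by expr.
    static List<String> select(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("attr1"));
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Hypothetical miniature document: only the first element's attributes vary.
    static String doc(String firstAttr1, String firstAttr2) {
        return "<root>"
            + "<element attr1='" + firstAttr1 + "' attr2='" + firstAttr2 + "'/>"
            + "<element attr1='2' attr2='test2'/>"
            + "</root>";
    }

    public static void main(String[] args) {
        String rejected = doc("1", "test"); // first element fails the attribute test
        String accepted = doc("9", "test"); // first element passes the attribute test
        System.out.println(select(rejected, INDEXED) + " " + select(rejected, COUNTED));
        System.out.println(select(accepted, INDEXED) + " " + select(accepted, COUNTED));
    }
}
```

With these inputs both expressions agree: the first line prints `[] []` and the second `[9] [9]`. So the difference I see should be purely a matter of how the processor evaluates the expressions, not of what they select.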

Update: Some sample XML:

<?xml version="1.0" encoding="iso-8859-1"?>
<root>
<element attr2="test" attr1="1">
    <child>17</child>
    <child>17</child>
    <child>16</child>
    ...
    <child>3</child>
    <child>2</child>
    <child>1</child>
</element>
<element attr2="test2" attr1="2">
    <child/>
    <child/>
    <child/>
    <child/>
    <child/>
    <child/>
    <child/>
    ...
    <child/>
</element>
....
<element attr2="test21" attr1="21" />

There are only about 20-25 elements, each with n children, and the maximum depth is 4 (/root/element/child/anotherChild).

AlteGurke
  • It's certainly surprising. But I wouldn't use the word "illogical". Is the number of element children under root very large? In that case it's possible that the first expression is looking at all the elements, while the second stops after the first match (is the expression being used in a context where only the first match is needed?) – Michael Kay Jun 12 '15 at 07:54
  • I added some sample xml, maybe you can explain why the XPath with the explicit filter is much slower than the XPath with the count(preceding-sibling::element). – AlteGurke Jun 18 '15 at 13:17
  • Sorry, I don't know the internals of Microsoft's XPath processor so I can't explain anything about its optimization strategy. – Michael Kay Jun 18 '15 at 14:32

1 Answer


I came to the conclusion that I just have to accept this. Microsoft says (https://support.microsoft.com/en-us/kb/815124):

All versions of MSXML, Version 3.0 and later, are faster with the explicit index filter. The improvement in performance depends on the position of the element in the child list of the parent. Instead of using the following:

/child_element

use the following:

/child_element[1]

In my case, the first example is WAY faster than the recommendation from Microsoft.

AlteGurke