I'm trying various ways of building a BHO that traverses the DOM of an HTML page. One implementation is written in C# and is registered in the registry with `ThreadingModel` set to `Both`. It goes like this:

1. retrieve `IWebBrowser2.Document`,
2. obtain the `IDocumentSelector` interface from the document object,
3. invoke `IDocumentSelector.querySelectorAll("*")`, which yields an `IHTMLDOMChildrenCollection` reference,
4. get `IHTMLDOMChildrenCollection.length`,
5. run a for-loop over the `0..length` range (`for(int index = 0; index < totalCount; index++)`),
6. inside each loop iteration, obtain the collection item using `IHTMLDOMChildrenCollection.item()`,
7. cast the collection item reference to `IHTMLElement2`,
8. obtain `IHTMLElement2.getBoundingClientRect()`,

and it works rather fine: a page with about 1500 elements gets traversed in 200-300 milliseconds (loop duration is measured by reading `DateTime.UtcNow` before and after the loop and getting `TotalMilliseconds` from the difference between the readings).
Another implementation is done with Visual C++ and ATL. It does mostly the same as the C# version: `CComQIPtr` is used in place of casts, and the loop is the same. It's also registered with `ThreadingModel` set to `Both`.
The C++ implementation traverses the very same page DOM in 40-60 milliseconds. Time is measured by reading `GetTickCount()` before and after the loop and taking the difference.
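For reference, the measurement in both versions has roughly the shape sketched below. This is not the actual BHO code: `measure_ms` and `visit_elements` are hypothetical names, the loop body is only a stand-in for the per-element COM calls, and `std::chrono::steady_clock` is swapped in for `GetTickCount()`, whose resolution is typically 10-16 ms and therefore coarse next to a 40-60 ms reading.

```cpp
#include <chrono>

// Runs `body` once and returns the elapsed time in milliseconds,
// measured with a monotonic clock.
template <typename Body>
double measure_ms(Body&& body)
{
    const auto start = std::chrono::steady_clock::now();
    body();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for the traversal loop: the real loop body would
// call item(), query for IHTMLElement2 and ask for the bounding rectangle.
long visit_elements(long total_count)
{
    volatile long visited = 0;  // volatile so the loop is not optimized away
    for (long index = 0; index < total_count; ++index) {
        visited = visited + 1;
    }
    return visited;
}
```

Calling `measure_ms([] { visit_elements(1500); })` returns the elapsed milliseconds for one pass. Running it once with step 8 in the loop body and once without it separates the cost of that single call from the rest of the traversal.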
Then I exclude step 8 from the loop iteration: the item is obtained and `IHTMLElement2` is obtained from it, but `getBoundingClientRect()` is not invoked. After this change both implementations run in mostly the same time, 40-50 milliseconds.
This looks weird. Why would only `getBoundingClientRect()` be affected? What's so special about it that it slows the C# version down so much?