I'm trying to select the main menu of this page, http://greyhoundstats.co.uk/index.php, by its ID ("menu_wholesome") in order to get its hyperlinks later on. In the HTML document there are two tags with this ID, a <div> and its child element <ul>, but when I search for them with the code below, I get the "object variable not set" error.

Option Explicit

Public Const MenuPage As String = "http://greyhoundstats.co.uk/index.php"

Sub BrowseMenus()

Dim XMLHTTPReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument

Dim MainMenuList As MSHTML.IHTMLElement
Dim aElement As MSHTML.IHTMLElementCollection
Dim ulElement As MSHTML.IHTMLUListElement
Dim liElement As MSHTML.IHTMLLIElement

XMLHTTPReq.Open "GET", MenuPage, False
XMLHTTPReq.send

HTMLDoc.body.innerText = XMLHTTPReq.responseText

    Set MainMenuList = HTMLDoc.getElementById("menu_wholesome")(0) '<-- error happens here

End Sub

Does anyone know why getElementById can't find the referenced ID, although it is part of the HTML document? I know that this method is supposed to return a unique ID, but when the same ID is used by more than one tag, I understand that it will return the first match, which should be the <div id="menu_wholesome"> part of the requested page.

thiggy01

2 Answers

It is unclear what you are trying to achieve, so I have only fixed the problem you are having at the moment. .getElementById() returns an individual element, so treating its result as a collection of elements throws that error. Compare the names getElementBy and getElementsBy: the s tells you which methods return a collection of elements (don't overlook it). You can only use (0) or a similar index on the result of a getElementsBy method.
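The difference between the two families of methods can be sketched as follows. This is a minimal illustration rather than a fix for the page in question: the inline markup, the id "menu" and the procedure name CompareLookups are made up for the example, and it assumes the project already references the Microsoft HTML Object Library, as the question's code does.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library".
Sub CompareLookups()
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim singleElement As Object
    Dim manyElements As Object

    ' Hypothetical inline markup, so no HTTP request is needed
    HTMLDoc.body.innerHTML = "<div id=""menu""><ul>" & _
        "<li><a href=""a.php"">A</a></li>" & _
        "<li><a href=""b.php"">B</a></li>" & _
        "</ul></div>"

    ' getElementById returns one element (or Nothing), so it is not indexed
    Set singleElement = HTMLDoc.getElementById("menu")
    If singleElement Is Nothing Then
        Debug.Print "No element with that id"
    Else
        Debug.Print singleElement.tagName
    End If

    ' getElementsByTagName returns a collection, so indexing with (0) is valid
    Set manyElements = HTMLDoc.getElementsByTagName("li")
    Debug.Print manyElements.Length
    Debug.Print manyElements(0).innerText
End Sub
```

Checking the result against Nothing before using it is a cheap way to tell "the element was not found" apart from other failures.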

You should also indent your code properly so that others can read it without any trouble. Note that the version below assigns the response to .innerHTML rather than .innerText:

Sub BrowseMenus()
    Const MenuPage$ = "http://greyhoundstats.co.uk/index.php"
    Dim HTTPReq As New XMLHTTP60, HTMLDoc As New HTMLDocument
    Dim MainMenuList As Object

    With HTTPReq
        .Open "GET", MenuPage, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With

    Set MainMenuList = HTMLDoc.getElementById("menu_wholesome")
End Sub
SIM

Firstly: set innerHTML rather than innerText, because you intend to traverse a DOM document.

Secondly: this line

Set MainMenuList = HTMLDoc.getElementById("menu_wholesome")(0)

is incorrect. getElementById returns a single element, which you cannot index into. You index into a collection.

Please note: both the div and the ul lead to the same content.

If you want to select them separately, use querySelector:

HTMLDoc.querySelector("div#menu_wholesome")
HTMLDoc.querySelector("ul#menu_wholesome")

The selectors above match on the tag name first, then on the id attribute.

If you want a collection of ids then use querySelectorAll to return a nodeList of matching items. Ids should be unique to the page but sometimes they are not!

HTMLDoc.querySelectorAll("#menu_wholesome")

You can then index into the nodeList e.g.

HTMLDoc.querySelectorAll("#menu_wholesome").item(0)

VBA:

Option Explicit

Public Const MenuPage As String = "http://greyhoundstats.co.uk/index.php"
Sub BrowseMenus()
    Dim sResponse As String, HTMLDoc As New MSHTML.HTMLDocument
    Dim MainMenuList As Object, div As Object, ul As Object

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", MenuPage, False
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
        sResponse = StrConv(.responseBody, vbUnicode)
    End With

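    ' Drop anything before the doctype; skip this if none is found, because Mid$ fails with a start position of 0
    If InStr(1, sResponse, "<!DOCTYPE ") > 0 Then sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))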
    HTMLDoc.body.innerHTML = sResponse

    Set MainMenuList = HTMLDoc.querySelectorAll("#menu_wholesome")

    Debug.Print MainMenuList.Length

    Set div = HTMLDoc.querySelector("div#menu_wholesome")
    Set ul = HTMLDoc.querySelector("ul#menu_wholesome")

    Debug.Print div.outerHTML
    Debug.Print ul.outerHTML

End Sub
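Since the stated goal is to collect the menu's hyperlinks, one possible next step is sketched below. It assumes that the links are <a> elements nested inside the ul, which has not been checked against the live page; the procedure name ListMenuLinks is made up for the example. The form of the printed href (relative or resolved) may vary, so treat the output as something to inspect, not a guaranteed format.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library".
Sub ListMenuLinks()
    Const MenuPage As String = "http://greyhoundstats.co.uk/index.php"
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim links As Object
    Dim i As Long

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", MenuPage, False
        .send
        ' innerHTML (not innerText) so the markup is parsed into elements
        HTMLDoc.body.innerHTML = .responseText
    End With

    ' Descendant selector: every anchor inside the ul with this id
    ' (assumes the menu items are <a> tags under that ul)
    Set links = HTMLDoc.querySelectorAll("ul#menu_wholesome a")

    ' Walk the nodeList by index, using .item(i) as in the answer above
    For i = 0 To links.Length - 1
        Debug.Print links.item(i).innerText, links.item(i).href
    Next i
End Sub
```

If links.Length prints as 0, the assumption about the markup is wrong and the selector needs adjusting to the page's actual structure.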
QHarr
  • As always, a perfect class about this issue. You have a gift for explaining to newbies like me. This is clear now. **Thank you very much** for your kind answer. – thiggy01 Sep 28 '18 at 10:18
  • My pleasure. Glad to help. Ids should be unique but I frequently find they are not and then querySelectorAll is a way to gather them in a nodeList. – QHarr Sep 28 '18 at 10:19
  • The QuerySelector method is exactly what I was looking for. – thiggy01 Sep 28 '18 at 10:24
  • Was there something missing from the answer please? The method above is basically what you have now put in your answer except you aren't catering for possible encoding of content nor possible cache retrieval of results. – QHarr Sep 28 '18 at 11:49
  • I liked your answer very much, but the object variable not set error was caused by `.innerText` instead of `.innerHTML`. I'm selecting it as the main answer. Thanks – thiggy01 Sep 28 '18 at 22:06
  • Hi, that is exactly what I said in my first line with: You want to work and set the innerHTML as you intend to traverse a DOM document. This is what I do with HTMLDoc.body.innerHTML = sResponse except I first ensure that the responseText is decoded as an additional safety measure. – QHarr Sep 28 '18 at 22:07
  • Ohh, now I see. I didn't understand what your code was doing. Sorry for my ignorance. – thiggy01 Sep 28 '18 at 22:11
  • I should maybe have been more explicit. If you have any questions about my code please feel free to ask. – QHarr Sep 28 '18 at 22:11
  • I see u use `.responseBody` converted to unicode as opposed to `.responseText` that I'm using, but I dont understand why and I also dont undertstand why did u used mid and instr to get the position of " – thiggy01 Sep 28 '18 at 22:47
  • The responseBody property returns the raw response as an array of bytes, so it can be useful whether the response is binary or text; StrConv(..., vbUnicode) then turns those bytes into a VBA string. You don't have to use the InStr part. It is just habit on my part. – QHarr Sep 28 '18 at 22:51