I want Excel (VBA) to parse an HTML file and extract a specific table from it.
My current approach is to load the file into a DOM and walk that. The problem is that DOMDocument60 throws a parse error ("Invalid Syntax"). After some more research I found that DOMDocument60 only parses XML, and typical HTML is not well-formed XML.
Are there any other options to get the DOM of an HTML file?
Sub myWebTest()
Dim File As Object, dom As Object
On Error Resume Next
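'ServerXMLHTTP is used because Msxml2.XMLHTTP has no timeout method
Set File = CreateObject("Msxml2.ServerXMLHTTP.6.0")
'Timeouts in milliseconds: resolve, connect, send, receive
File.setTimeouts 2000, 2000, 2000, 2000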
'The port belongs after the host name, not after the path
File.Open "GET", "http://www.microsoft.com:80/en-au/default.aspx", False
'Send an IE 8 User-Agent header
File.SetRequestHeader "User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 1.1.4322; .NET CLR 3.5.30729; .NET CLR 3.0.30618; .NET4.0C; .NET4.0E; BCD2000; BCD2000)"
File.Send
On Error GoTo 0
Set dom = CreateObject("Msxml2.DOMDocument")
'Dim dom As New DOMDocument60
dom.LoadXML File.ResponseText
MsgBox dom.ChildNodes.Length
End Sub