I am trying to get a VBA webscraping tool to pull the data from Zillow for all sold properties within the last 24 months and paste it in an analytical format. The issues I am running into are:
1- the data is pulled as HTML format so it is stuffed into a single cell. What I have tried A-tried first to do a blanket convert to colluom function but the lengths are all variable and it will also sometimes cut off anything past the address. B- tried to copy and paste to a .txt file but the path kept breaking somewhere and not all the data made it.
2- clicking the "Next page" button. What I have tried A-searching href tags to create next url, the keep coming back as "Null" and break the chain B- using Css selector to pick the a tag with title "Next page". I think this is working, but it just keeps clicking the pages without downloading the data after page 1.
Bonus points if you can: -Also download the link href. See point 2 as I keep getting null whenever I try to get the href.
Make it so the base url pulls from a cell on sheet1 so I can make it more dynamic. Whenever I try it either tells me a specific range is not constant or the object is not valid.
Any help or clarity on what I am doing wrong would be helpful! Code:
Option Explicit
Sub GetZillowSold()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim ListCards As MSHTML.IHTMLElementCollection
Dim InfoCard As MSHTML.IHTMLElement
Dim SoldList As MSHTML.IHTMLElementCollection
Dim SoldDate As MSHTML.IHTMLElement
Dim Zpages As MSHTML.IHTMLElementCollection
Dim Zpage As MSHTML.IHTMLElement
Dim CardID As Integer
XMLReq.Open "GET", "https://www.zillow.com/the-colony-tx/sold/house_type/3-_beds/", False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
Exit Sub
End If
Worksheets.Add
Range("A1").Value = "Address"
Range("b1").Value = "Price"
Range("c1").Value = "Bedroom"
Range("d1").Value = "Bath"
Range("e1").Value = "Sqft"
Range("f1").Value = "Date"
Range("A2").Select
HTMLDoc.body.innerhtml = XMLReq.responseText
Set XMLReq = Nothing
Set ListCards = HTMLDoc.getElementsByClassName("list-card-info")
'Debug.Print ListCards.Length
For Each InfoCard In ListCards
ActiveCell.Value = InfoCard.innerText
ActiveCell.Offset(1, 0).Select
Next InfoCard
Range("f2").Select
Set SoldList = HTMLDoc.getElementsByClassName("list-card-top")
'Debug.Print SoldList.Length,
For Each SoldDate In SoldList
ActiveCell.Value = Mid(SoldDate.innerText, 6)
ActiveCell.Offset(1, 0).Select
Next SoldDate
Set Zpages = HTMLDoc.getElementsByTagName("a")
For Each Zpage In Zpages
If (Zpage.getAttribute("title") = "Next page") Then
Zpage.Click
End If
Exit For
Next Zpage
End Sub