
How do I get all of the innerText from a webpage? The code below only gets the first line; the <p> (paragraph) tag seems to stop the read. I'd be okay with adding vCompanyCity, etc., but I can't figure out how to get the next line in this box.

Set ie = CreateObject("InternetExplorer.application")
ie.navigate ("https://www.xxxx.com/")
While ie.Busy Or ie.ReadyState <> 4: DoEvents: Wend
vCompanyAddress(i - 1) = ie.document.all("header-left-box").innerText
more code....
End Sub

Here is the website stuff

wogsland
Steve Britton
    Have you tried the `GetElementById` method? e.g. `vCompanyAddress(i - 1) = ie.document.body.getelementbyid("header-left-box").innerText`. Are you sure there aren't multiple lines of text within the string separated by vbCrLF or vbLf characters? –  Mar 12 '17 at 00:14
  • Run-time error '438': Object doesn't support this property or method. Sad faces. However, now that I'm sober, it did/does capture it all: I wrote the results into a cell in Excel, and expanding the formula bar to see the entire cell shows CRLF inside the cell. It worked all along; I was just dumb when I posted the question. – Steve Britton Mar 12 '17 at 19:05

2 Answers


Give something like this a shot. There are some details missing in your post, so I had to make some assumptions.

There are two approaches below:

1) Use getElementById and see whether innerText returns the full text.

2) Use getElementById and then iterate over the paragraph tags inside the element.

Public Sub test()
    Dim ie              As Object
    Dim vCompanyAddress As Variant
    Dim i               As Long: i = 0
    Dim Elements        As Object
    Dim Element         As Object

    ReDim vCompanyAddress(1000) ' Not sure how big this array should be
    Set ie = CreateObject("InternetExplorer.Application")

    With ie
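        .navigate "https://www.xxxx.com/"
        ' Wait until the page has finished loading (ReadyState 4 = complete)
        While .Busy Or .ReadyState <> 4: DoEvents: Wend

        ' Approach 1: read the text of the whole element in one go
        vCompanyAddress(i) = .document.getElementById("header-left-box").innerText

        ' Approach 2: get the element, then build a collection of all the
        ' paragraph tags inside it. Use one approach or the other: the loop
        ' below starts at the same index and overwrites the value stored above.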
        Set Elements = .document.getElementById("header-left-box").getElementsByTagName("p")

        For Each Element In Elements
            vCompanyAddress(i) = Element.innerText
            i = i + 1
        Next
    End With
End Sub
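
The asker's follow-up comment notes that innerText already contained every line, separated by CRLF. If each line should land in its own variable (for example the vCompanyCity mentioned in the question), a minimal sketch along these lines may help. SplitLines is a hypothetical helper name, not something from the original code, and the line order inside the box is an assumption.

```vba
' Hypothetical helper: splits multi-line text (such as innerText) into
' a zero-based array holding one line per entry.
Public Function SplitLines(ByVal text As String) As Variant
    ' Normalize CRLF and bare CR to LF so Split only sees one delimiter
    text = Replace(text, vbCrLf, vbLf)
    text = Replace(text, vbCr, vbLf)
    SplitLines = Split(text, vbLf)
End Function
```

It could then be used roughly like this, assuming the first line is the street address and the second is the city:

```vba
Dim parts As Variant
parts = SplitLines(ie.document.getElementById("header-left-box").innerText)
' Split returns an empty array (UBound = -1) for empty text, so check first
If UBound(parts) >= 0 Then vCompanyAddress(i - 1) = parts(0)
If UBound(parts) >= 1 Then vCompanyCity(i - 1) = parts(1)
```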
Ryan Wildry

Here is a sample of the method I used to find all the tags I needed in order to scrape, click, and send information from a webpage to myself, as well as to log in and handle other routine tasks, such as entering information, without doing them manually. With this you can build more complex procedures that use InStr to figure out what you want to search for, click, and so on. For example, if Microsoft or some other company keeps changing the tag of a JavaScript button, you can use text strings and offsets to click the button no matter what its name or tag is.

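Sub BIG_EskimoRoll()
    ' Assumes ie is a module-level InternetExplorer object that has
    ' already finished loading the page.
    ' Some of the properties printed below (Name, Type, Value, href) may not
    ' exist on every element, so failed reads are skipped instead of halting.
    On Error Resume Next
    Dim ExpR As Object
    Set ExpR = ie.Document.getElementsByTagName("p")

    i = 0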
    While i < ExpR.Length
        If ExpR(i).Name <> "" Then
            If ExpR(i).className = "expireFeature" Then msg = ExpR(i).innerText
            
            ''''this is good code to keep around'''''''''''
            'If ExpR(i).className = "expireFeature" Then Debug.Print ExpR(i).className
            'If ExpR(i).className = "expireFeature" Then Debug.Print ExpR(i).innerText
            ' Set text for search
            Debug.Print "Ptsn: " & i
            Debug.Print "Nm: " & ExpR(i).Name
            Debug.Print "Type: " & ExpR(i).Type
            Debug.Print "Vl: " & ExpR(i).Value
            Debug.Print "ID: " & ExpR(i).ID
            Debug.Print "inTxt: " & ExpR(i).innerText
            Debug.Print "inHTML: " & ExpR(i).innerHTML
            Debug.Print "outHTML: " & ExpR(i).outerHTML
            Debug.Print "cNm: " & ExpR(i).className
            Debug.Print "tNm: " & ExpR(i).tagName
            Debug.Print "href: " & ExpR(i).href
            
        End If
        i = i + 1
    Wend

End Sub
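
The idea of locating an element by its text rather than by its name or tag attributes could be sketched as below. This is only an illustration of the InStr approach described above, not the answerer's actual code; FindElementByText and the "Sign in" caption are hypothetical.

```vba
' Hypothetical helper: returns the first element with the given tag name
' whose visible text contains searchText (case-insensitive), or Nothing
' when no element matches.
Public Function FindElementByText(ByVal doc As Object, _
                                  ByVal tagName As String, _
                                  ByVal searchText As String) As Object
    Dim el As Object
    For Each el In doc.getElementsByTagName(tagName)
        If InStr(1, el.innerText, searchText, vbTextCompare) > 0 Then
            Set FindElementByText = el
            Exit Function
        End If
    Next el
End Function
```

A caller would check the result before clicking, since the function returns Nothing when the text is not found:

```vba
Dim btn As Object
Set btn = FindElementByText(ie.document, "button", "Sign in")
If Not btn Is Nothing Then btn.Click
```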
jbay