
I'm trying to strip HTML from a string and found two methods in this SO thread.

The code for the first answer works but uses late binding.

With CreateObject("htmlfile")
    .Open
    .write "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
    .Close
    MsgBox "text=" & .body.outerText
End With

The code for the alternative answer, which uses early binding, gives a compile error ("Function or interface marked as restricted, or the function uses an Automation type not supported in Visual Basic").

Public Function StripHtml(inputHtml As String) As String
    With New HTMLDocument
        .Open
        'Following line gives compile error
        .write "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
        .Close
        StripHtml = .body.outerText
    End With
End Function

My questions:

  1. Is the alternative answer simply not the equivalent?
  2. Is there an early binding equivalent to the first answer, which works?
  3. Why does CreateObject("htmlfile") work at all when I cannot find that object type in the object browser?
RobertSF
  • "answers 1 and 4" - the sorting depends on who's viewing them and how they're sorting answers. "1 and 4" means pretty much nothing. If you mean to link to specific SO posts, use the appropriate [share] links under each. – Mathieu Guindon Oct 18 '17 at 20:45
  • @Mat'sMug Thanks. I didn't know that. I did link to a specific SO post, and I included the code snippets from the answers, so I'll edit the question. – RobertSF Oct 18 '17 at 21:57

1 Answer


These two are equivalent, I think:

Option Explicit

Sub Macro1()

    Dim aaa As Object
    Set aaa = CreateObject("htmlfile")
    aaa.Open
    aaa.write "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
    aaa.Close
    Debug.Print "text=" & aaa.body.outerText

' ---------------------------------------------------

    Dim bbb As New HTMLDocument
    bbb.body.innerHTML = "<p>foo <i>bar</i> <u class='farp'>argle </zzzz> hello </p>"
    Debug.Print "text=" & bbb.body.outerText

End Sub

Your question about CreateObject("htmlfile") should read "why does With CreateObject("htmlfile") work...".

The With statement holds a reference to a temporary, unnamed object (loosely similar to an anonymous function in some languages).

If you need to examine that object, use a Set statement to assign it to a variable that references it.
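
As a rough sketch, the early-bound variant from Macro1 could be wrapped into the StripHtml function from the question. This assumes the project has a reference to the Microsoft HTML Object Library (MSHTML), and the variable name doc is only illustrative. It sets body.innerHTML instead of calling .write, which is the call that fails to compile in the question.

```vba
Option Explicit

' Sketch only: assumes a reference to "Microsoft HTML Object Library" (MSHTML).
Public Function StripHtml(ByVal inputHtml As String) As String
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Assign the markup to the body instead of calling .Open/.write/.Close,
    ' since the early-bound .write call is the line that does not compile.
    doc.body.innerHTML = inputHtml

    ' Return the text of the body, as in the snippets above.
    StripHtml = doc.body.outerText
End Function
```

It could then be called with something like Debug.Print StripHtml("<p>foo <i>bar</i></p>"). Because the markup goes into the body only, this is meant for HTML fragments rather than complete documents.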

jsotola
  • Thank you! The trick seems to be not using `.Open`. Then the object will work as an HTML -> text converter. To clarify, I know about `With`. What I don't understand is why `"htmlfile"` is a valid object name. I thought the argument to `CreateObject` had to be the name of an object that VBA knows about, but I can't find it in the object browser. – RobertSF Oct 19 '17 at 02:28
  • Found it: it is a Windows COM object, so Windows knows about it, and by default so does VBA (I think). Some info here: https://autohotkey.com/board/topic/56987-com-object-reference-autohotkey-v11/. It would probably work for any registered DLL file. – jsotola Oct 19 '17 at 03:07
  • They are not equivalent if the HTML string you are trying to pass has a "head" section. If you pass such a string to `body.innerHTML`, the head part won't be added to the HTMLDocument object. – DecimalTurn Aug 23 '20 at 23:29