This is related to the question "Parse html file using MSHTML in VBScript".

That question shows how to use an HTMLFILE object:

' args is expected to hold the HTML content as a string
Dim oHF : Set oHF = CreateObject("HTMLFILE")
oHF.write args
WScript.Echo oHF.title

But when I run the code above and args (the HTML content) contains Unicode characters, the output is garbled. For example, the title DOMManipulation বাংলা ব্লগ became DOMManipulation বাংলা বà§à¦²à¦—.

How do I solve this?

Arnab
  • One string became the other *where*? The code you posted will write the value of the (undefined) variable `args` to an `HTMLDocument` object, and nothing else. Please provide a [mcve] demonstrating how you're passing the input value to your VBScript code, how you populate `args`, and how you retrieve the content from the HTML document afterwards. The most likely reason for the garbled output (also known as "mojibake") is an encoding problem. We won't be able to find and fix it without more information, though. – Ansgar Wiechers Aug 24 '19 at 16:46
  • Check the HTML encoding. E.g. `à¦¬` is the UTF-8 representation of `ব` _Bengali Letter Ba_ etc… – JosefZ Aug 24 '19 at 18:36
  • @JosefZ Where did you find that information? I tried an HTML entity encoder (https://mothereff.in/html-entities) and a UTF-8 encoder (https://www.browserling.com/tools/utf8-encode), but neither converts to that character. – Arnab Sep 03 '19 at 03:47
  • @AnsgarWiechers I have updated the question with an additional line of code where I get the document title. If this is an encoding issue, I presumably have to store the title in a variable, decode it, and then echo the value. But how? Moreover, https://developer.rhino3d.com/guides/rhinoscript/read-write-utf8/ states that the File System Object can read only ASCII or Unicode text files. Is the same true for the HTMLFILE object as well? I can't seem to find documentation for the HTMLFILE object. Before calling oHF.write, can I declare that the encoding is UTF-8, e.g. oHF.CharSet = "utf-8"? – Arnab Sep 03 '19 at 06:03
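
The comments point at an encoding mismatch: the garbled title looks like UTF-8 bytes decoded with a single-byte code page before the string ever reaches `oHF.write`. The question does not say how `args` is populated, so the following is only a minimal sketch under the assumption that the HTML is stored on disk as UTF-8. It reads the file through `ADODB.Stream` with an explicit charset instead of the File System Object, then writes the title back out as UTF-8. The file names `page.html` and `title.txt` and the helper names `ReadTextFile` and `WriteTextFile` are hypothetical.

```vbscript
Option Explicit

' Hypothetical paths (not from the question); adjust as needed.
Const INPUT_PATH = "page.html"
Const OUTPUT_PATH = "title.txt"

' ADODB.Stream constants (VBScript has no type library import).
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Read a whole text file, decoding its bytes with the given charset.
Function ReadTextFile(path, charset)
    Dim stream : Set stream = CreateObject("ADODB.Stream")
    stream.Type = adTypeText
    stream.Charset = charset
    stream.Open
    stream.LoadFromFile path
    ReadTextFile = stream.ReadText
    stream.Close
End Function

' Write a string to a file, encoding it with the given charset.
Sub WriteTextFile(path, text, charset)
    Dim stream : Set stream = CreateObject("ADODB.Stream")
    stream.Type = adTypeText
    stream.Charset = charset
    stream.Open
    stream.WriteText text
    stream.SaveToFile path, adSaveCreateOverWrite
    stream.Close
End Sub

' Assumption: the HTML file is encoded as UTF-8.
Dim html : html = ReadTextFile(INPUT_PATH, "utf-8")

Dim oHF : Set oHF = CreateObject("HTMLFILE")
oHF.write html
oHF.close

' Save the title as UTF-8 rather than echoing it to the console.
WriteTextFile OUTPUT_PATH, oHF.title, "utf-8"
```

The sketch writes the title to a file because `WScript.Echo` under `cscript` goes through the console code page, which may still show `?` or junk for Bengali text even when the string itself is correct. If the HTML arrives some other way (for example as a command-line argument), the decoding step would need to move to wherever the bytes first become a string.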
