2

This is in Word for MAC VBA. I want to save the Unicode character from a text box to text file. For example this character "⅛".

I use this code.

Dim N as Long
N = FreeFile
Dim strText as String
strText = Textbox1.Text 'This is what is in the textbox "⅛"
Open <file path> For Output As N 
     Print #N, strText
Close N

It does not save the Unicode character. I understand I have to change the text encoding format. How do I do that?

Likewise, how to read the text file with the Unicode format?

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Codename K
  • 890
  • 4
  • 23
  • 52

4 Answers4

11

I hope this will fit VBA for Word on Mac as well, but on Windows I have the CreateTextFile method of the FileSystemObject (see MSDN doc). There I can define to create a unicode text file.

  Set fsObject = CreateObject("Scripting.FileSystemObject")
  Set xmlFile = fsObject.CreateTextFile("path/filename.txt", True, True) 'the second "true" forces a unicode file.

  xmlFile.write "YourUnicodeTextHere"
  xmlFile.close
The Conspiracy
  • 3,495
  • 1
  • 17
  • 18
  • This way you can create a file with unicode coding, but **not in UTF-8** with byte order mark and so on. UTF-8 is not only unicode - see [wikipedia](http://en.wikipedia.org/wiki/Utf-8). The only way in VBA, what I know, is with ADODB. – Roman Plischke Nov 14 '13 at 07:43
  • OP asked for Unicode. He did NOT ask for UTF-8 with byte order mark. – The Conspiracy Nov 16 '13 at 19:45
  • Oops, you are right. I've edited my answer for Unicode coding. Results are equal. - However, ADO method has more possibilities in coding, see my edited answer. – Roman Plischke Nov 18 '13 at 07:39
7

VBA can't code text in UTF-8 this way. Use ADODB - yes, for text, not for database.

'ensure reference is set to Microsoft ActiveX DataObjects library
'(the latest version of it) under "tools/references"
Sub AdoTest()
    Dim adoStream As ADODB.Stream
    
    Set adoStream = New ADODB.Stream
    
    'Unicode coding
    adoStream.Charset = "Unicode" 'or any string listed in registry HKEY_CLASSES_ROOT\MIME\Database\Charset
    
    'open sream
    adoStream.Open
    
    'write a text
    adoStream.WriteText "Text for testing: ěšč", StreamWriteEnum.stWriteLine
    
    'save to file
    adoStream.SaveToFile "D:\a\ado.txt"
    
    adoStream.Close
End Sub

Reading is simplier, see my answer here:

Unicode and UTF-8 with VBA

Edited: I've inserted complete example.

Edited 2: Added refernce to list of coding in the registry

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
Roman Plischke
  • 1,042
  • 7
  • 14
3

The question is for VBA on Mac, and I'm afraid none of the answers work on a Mac.

The question is about Unicode which comes in many flavours. I'll address the UTF-16 aspect of it. UTF-8 follows a different path, but it isn't difficult too. AFAIU, your question is about UTF-16 string.

The code below has no error handling, I'll let you take care of that.

Function writeUnicodeTextToFile(filePathName As String, myText As String)

`Dim myFileNumber As Long, I As Long, byteArray() As Byte

myFileNumber = FreeFile()
Open filePathName For Binary As #myFileNumber

ReDim byteArray(1)

' Create a BOM for your Unicode flavour
' (CHOOSE! one of the two, programmatically, or hard-code it)
 ' => Little Endian
    byteArray(0) = 255: byteArray(1) = 254
' => Big Endian
    'byteArray(0) = 254: byteArray(1) = 255

' now write the two-byte BOM
Put myFileNumber, 1, byteArray

' redimension your byte array
' note it works even if you don't Redim (go figure) but it's more elegant
I = (LenB(myText) / 2) - 1
ReDim byteArray(I)

' populate the byte array...
byteArray = myText

' ... and write you text AFTER the BOM
Put myFileNumber, 3, byteArray
Close #myFileNumber
End Function
Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
  • Hi, can you just briefly explain what needs to be adapted for UTF-8 export? Thx – zdusper Jan 25 '21 at 09:32
  • It's more complicated. You need a VBA function to convert from plain text to UTF-8. Once that is done, you must first write a 3-byte BOM, then write data to disk as 8-bit text (ANSI). – Yves Champollion Jan 26 '21 at 12:38
  • @PanagiotisKanavos Since those are unavailable on a mac, perhaps, as the answerer specified – Erik A Jul 02 '21 at 15:11
1

Here is a VBA routine that takes a string as input (your text), and fills an array of bytes. Then you write that array to disk in binary mode, making sure you start writing it after the first three bytes (BOM).

You'll need those Public variables: byteArray() As Byte, regexUTF8 As String

Sub testing()

' creating the BOM

Dim bom(2) As Byte, someFile As Long

bom(0) = 239: bom(1) = 187: bom(2) = 191

' Writing something as utf-8 UTF16toUTF8 "L'élève de l'école"

someFile = FreeFile() Open "MacDisk:test.txt" For Binary As #someFile ' first, the BOM Put #someFile, 1, bom ' then the utf-8 text Put #someFile, 4, byteArray1 Close #someFile End Sub

Sub UTF16toUTF8(theString As String)

' by Yves Champollion ' Transforms a VB/VBA string (they're all 16-bit) into a byteArray1, utf-8 compliant

    If isStringUTF8(theString) Then Exit Sub

    Dim iLoop As Long, i As Long, k As Long

    k = 0
    ReDim byteArray1(Len(theString) * 4)
    For iLoop = 1 To Len(theString)
        i = AscW(Mid$(theString, iLoop, 1))
        If i < 0 Then i = i + 65536
        If i > -1 And i < 128 Then
            byteArray1(k) = i
            k = k + 1
        ElseIf i >= 128 And i < 2048 Then
            byteArray1(k) = (i \ 64) Or 192
            byteArray1(k + 1) = (i And 63) Or 128
            k = k + 2
        ElseIf i >= 2048 And i < 65536 Then
            byteArray1(k) = (i \ 4096) Or 224
            byteArray1(k + 1) = ((i \ 64) And 63) Or 128
            byteArray1(k + 2) = (i And 63) Or 128
            k = k + 3
        Else
            byteArray1(k) = (i \ 262144) Or 240
            byteArray1(k + 1) = (((i \ 4096) And 63)) Or 128
            byteArray1(k + 2) = ((i \ 64) And 63) Or 128
            byteArray1(k + 3) = (i And 63) Or 128
            k = k + 4
        End If
    Next

    ReDim Preserve byteArray1(k - 1)

End Sub

Function isStringUTF8(theString As String) As Boolean

    Dim i As Integer, j As Integer, k As Integer

    ' Prime the regex argument
    If Len(regexUTF8) <> 66 Then
        regexUTF8 = "*[" + Space$(62) + "]*"
        For i = 192 To 253
            Mid(regexUTF8, i - 189, 1) = Chr(i)
        Next
    End If

    ' First quick check: any escaping characters?
    If Not theString Like regexUTF8 Then Exit Function

    'longer check: are escaping characters followed by UTF-8 sequences?
    For i = 1 To Len(theString) - 3
        If Asc(Mid(theString, i, 1)) > 192 Then
            k = Asc(Mid(theString, i, 1))
            If k > 193 And k < 220 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            If k > 223 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) And (Asc(Mid(theString, i + 2, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            j = j + 1
            If j > 100 Then Exit For
        End If
    Next
End Function