
I am having problems converting from ANSI to Unicode and back. The following code snippet shows what I am doing. I am getting 0x57 (ERROR_INVALID_PARAMETER) errors.

DECLARE DYNAMIC LIBRARY "kernel32"
    FUNCTION MultiByteToWideChar& (codePage~&, dwFlags~&, lpszMbstring$, byteCount&, lpwszWcstring$, wideCount&)
    FUNCTION WideCharToMultiByte& (codePage~&, dwFlags~&, lpWideString$, BYVAL ccWideChar%, lpMultiByte$, BYVAL multibyte%, BYVAL defaultchar&, BYVAL usedchar&)
    FUNCTION GetLastError& ()
END DECLARE
DIM Filename AS STRING * 260, NewFilename AS STRING * 260, MultiByte AS STRING * 260
PRINT "Enter filename";: INPUT Filename$: 'Filename$ = Filename$ + CHR$(0)
x = MultiByteToWideChar(0, 0, Filename$, LEN(Filename$), NewFilename$, 260)
IF x = 0 THEN
    PRINT "Error 0x"; HEX$(GetLastError)
ELSE
    PRINT "Processing: "; NewFilename$
END IF
' do unicode stuff here
x = WideCharToMultiByte(65001, 0, NewFilename$, LEN(NewFilename$), MultiByte$, 0, 0, 0)
' display processed filename
IF x = 0 THEN
    PRINT "Error 0x"; HEX$(GetLastError)
ELSE
    PRINT MultiByte$
END IF
eoredson

1 Answer


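More of the arguments need to be passed with the BYVAL keyword. Without it, QB64 passes a numeric argument by reference, so the API receives an address where it expects a code page, flag, or count: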

FUNCTION MultiByteToWideChar& (BYVAL codePage~&, BYVAL dwFlags~&, lpszMbstring$, BYVAL byteCount&, lpwszWcstring$, BYVAL wideCount&)
FUNCTION WideCharToMultiByte& (BYVAL codePage~&, BYVAL dwFlags~&, lpWideString$, BYVAL ccWideChar%, lpMultiByte$, BYVAL multibyte%, BYVAL defaultchar&, BYVAL usedchar&)

Aside from that, the length of a STRING * 260 is always 260, regardless of the value stored in it, so Filename = Filename + CHR$(0) won't work as intended. Neither MultiByteToWideChar nor WideCharToMultiByte requires null-terminated input anyway: that is what the byteCount and ccWideChar params are for, since sometimes you only want to operate on part of a string.

Worse, even if you use _MEMFILL to set all bytes of Filename to 0 to allow you to deal with things using ASCIIZ strings, INPUT and LINE INPUT will fill any remaining bytes not explicitly entered into Filename with CHR$(32) (i.e. a blank space as if you pressed the spacebar). For example, if you enter "Hello", there would be 5 bytes for the string entered and 255 bytes of character code 32 (or &H20 if you prefer hexadecimal).

To save yourself this terrible headache ("hello world.bas" is a valid filename!), you'll want to use STRING, not STRING * 260. If the length is greater than 260, you should probably print an error message. Whether you allow a user to enter a new filename or not after that is up to you.
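The padding behaviour described above can be seen without calling any Windows API. This is a minimal sketch; the variable names `Fixed` and `Flexible` are made up for illustration:

```basic
DIM Fixed AS STRING * 260 ' fixed-length: always 260 bytes
DIM Flexible AS STRING    ' variable-length: grows and shrinks with its contents

Fixed = "Hello"
Flexible = "Hello"

PRINT LEN(Fixed)               ' 260, no matter what was assigned
PRINT LEN(Flexible)            ' 5
PRINT ASC(MID$(Fixed, 6, 1))   ' 32, the unused bytes are padded with spaces
PRINT LEN(RTRIM$(Fixed))       ' 5, but RTRIM$ would also strip spaces the user typed on purpose
```

Trimming with RTRIM$ can recover the entered text from a fixed-length string, but a plain STRING avoids the guesswork because LEN already reports the real length.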

You'll also want to save the return value of MultiByteToWideChar, since it is the number of wide characters written to NewFilename:

DIM Filename AS STRING
DIM NewFilename AS STRING * 520 ' room for 260 wide characters at 2 bytes each
DIM MultiByte AS STRING * 260
...

' Note: LEN(NewFilename) = 520 (**always**), and it counts bytes.
' wideCount is measured in wide characters, so pass half of it.
' This is also why the number of wide chars written is saved.
NewFilenameLen = MultiByteToWideChar(0, 0, Filename, LEN(Filename), NewFilename, LEN(NewFilename) \ 2)

...

' Note: LEN(MultiByte) = 260 (**always**)
x = WideCharToMultiByte(65001, 0, NewFilename, NewFilenameLen, MultiByte, LEN(MultiByte), 0, 0)

...
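For reference, the fragments above might be assembled into one program roughly as follows. Treat it as an untested sketch under a few assumptions that are not in the original code: the count parameters are declared as 32-bit LONG (`&`) because the Win32 prototypes use `int`, the two pointer parameters are declared as `_OFFSET` (`%&`) so that passing 0 means NULL on both 32-bit and 64-bit builds, and the wide buffer is sized at 2 bytes per character. The names `WideName`, `Utf8Name`, `WideLen` and `Utf8Len` are made up for illustration.

```basic
DECLARE DYNAMIC LIBRARY "kernel32"
    FUNCTION MultiByteToWideChar& (BYVAL codePage~&, BYVAL dwFlags~&, lpszMbstring$, BYVAL byteCount&, lpwszWcstring$, BYVAL wideCount&)
    ' Assumption: counts as LONG and pointers as _OFFSET, unlike the declaration quoted earlier
    FUNCTION WideCharToMultiByte& (BYVAL codePage~&, BYVAL dwFlags~&, lpWideString$, BYVAL ccWideChar&, lpMultiByte$, BYVAL multibyte&, BYVAL defaultchar%&, BYVAL usedchar%&)
    FUNCTION GetLastError& ()
END DECLARE

DIM Filename AS STRING         ' variable-length, so LEN is the real length
DIM WideName AS STRING * 520   ' 260 wide characters, 2 bytes each
DIM Utf8Name AS STRING * 260   ' output buffer, measured in bytes
DIM WideLen AS LONG, Utf8Len AS LONG

LINE INPUT "Enter filename: "; Filename
IF LEN(Filename) = 0 OR LEN(Filename) > 260 THEN PRINT "Filename must be 1 to 260 characters": END

' Code page 0 is CP_ACP; the last argument is a count of wide characters, not bytes
WideLen = MultiByteToWideChar(0, 0, Filename, LEN(Filename), WideName, 260)
IF WideLen = 0 THEN PRINT "Error 0x"; HEX$(GetLastError): END

' do unicode stuff here

' Code page 65001 is CP_UTF8; the last two arguments must be NULL for it
Utf8Len = WideCharToMultiByte(65001, 0, WideName, WideLen, Utf8Name, LEN(Utf8Name), 0, 0)
IF Utf8Len = 0 THEN PRINT "Error 0x"; HEX$(GetLastError): END

' Only the first Utf8Len bytes are meaningful; the rest is padding
PRINT LEFT$(Utf8Name, Utf8Len)
```

Keeping both return values matters because neither fixed-length buffer shrinks to fit: `LEFT$` with the returned count is what separates the converted text from the padding.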
  • Ok, thanks again. That should just about do it for awhile as I piece all the code together.. – eoredson Sep 06 '17 at 05:21
  • BTW: Why does FindFirstFileW return .cAlternateFilename as NUL? – eoredson Sep 08 '17 at 00:36
  • The docs state that a DOS 8.3 format filename will be in `cAlternateFilename` unless `cFilename` is already an 8.3 filename, in which case `cAlternateFilename` is an empty string. For example, `foo.txt` would result in an empty `cAlternateFilename` member while `HelloWorld.txt` and `foo.config` might result in `HelloW~1.txt` and `foo~7.con`. –  Sep 08 '17 at 00:45
  • _If a file has a long file name, the complete name appears in the `cFileName` member, and the 8.3 format truncated version of the name appears in the `cAlternateFileName` member. Otherwise, `cAlternateFileName` is empty._ - from "Remarks" section of [`WIN32_FIND_DATA` structure docs](https://msdn.microsoft.com/en-us/library/windows/desktop/aa365740(v=vs.85).aspx) –  Sep 08 '17 at 00:54
  • The reason why .cAlternateFilename was not being returned correctly is because the .cFilename in Win32_Find_DataW must be 520 bytes to support the 2-byte pairs of an extended ASCIIZ. – eoredson Sep 12 '17 at 02:22
  • @eoredson Ah, yes. I suppose I've grown too used to Windows and Linux C programming with their Unicode-based `wchar_t`. If I used it enough, I'd fork QB64 and add support for Unicode strings, but I almost never use it except for running some QuickBASIC code. Good catch –  Sep 12 '17 at 03:23
  • And the .cAlternateFilename must be doubled to 28 bytes (14 wide characters) in the Win32_Find_DataW as well. – eoredson Sep 18 '17 at 01:46