I have the following code, which reads the text from an edit control (pure WinAPI):

BOOL CALLBACK DlgProc(HWND hw, UINT msg, WPARAM wp, LPARAM lp) {
    switch (msg)
    {
    case WM_INITDIALOG:
        SendDlgItemMessage(hw, IDC_EDITMASK, EM_SETLIMITTEXT, 512, 0);
        return true;

        case WM_CLOSE:
            DestroyWindow(hw);
            return TRUE; 


        case WM_COMMAND:
            HWND hCtrl;
            int length;
            wchar_t * text;
            switch (LOWORD(wp))
            {

                case IDCPROCESS:

                    nElements = 1;
                    hCtrl = GetDlgItem(hw, IDC_EDITMASK); 

                    length = GetWindowTextLengthW(hCtrl);        
                    if (length == 0) {
                        MessageBox(hw, L"Invalid mask", L"Error", 0);
                        return FALSE;
                    }

                    text = (wchar_t*)HeapAlloc(hProcessHeap, HEAP_ZERO_MEMORY, length * sizeof(wchar_t) + sizeof(wchar_t));

                    GetWindowTextW(hCtrl, text, length + sizeof(wchar_t));

                    char *test = (char*)text;

                    int pos = 0;
                    int startPos = 0;
                    char dbg[2] = { 0 };
                    while (pos <= length) {
                        dbg[0] = text[pos];
                        OutputDebugStringA(dbg); // here i output the text by characters
                        if (text[pos] == ',' || pos == length) {

                            if(!szMasks)
                                szMasks = (wchar_t**)HeapAlloc(hProcessHeap, HEAP_ZERO_MEMORY, sizeof(wchar_t*)*nElements);
                            else
                                szMasks = (wchar_t**)HeapReAlloc(hProcessHeap, HEAP_ZERO_MEMORY,szMasks, sizeof(wchar_t*)*nElements);

                            int bufferSize = pos - startPos;
                            szMasks[nElements - 1] = (wchar_t*)HeapAlloc(hProcessHeap, HEAP_ZERO_MEMORY, (bufferSize + 2) * sizeof(wchar_t));
                            if(bufferSize % sizeof(wchar_t) != 0)
                                bufferSize++;
                            int copyLength = bufferSize / sizeof(wchar_t);

                            wcsncpy(szMasks[nElements - 1], text + startPos, copyLength);
                            OutputDebugStringW(szMasks[nElements - 1]);
                            OutputDebugStringW(L"\r\n");

                            nElements++;
                            startPos = pos+1;

                        }

                        pos++;
                    }

                    searchMasks.count = nElements-1;
                    searchMasks.szMasks = szMasks;

                    HeapFree(hProcessHeap, 0, text);
                    DestroyWindow(hw);

                    return TRUE;


            }
            break;



    case WM_DESTROY:
            PostQuitMessage(0);
            return TRUE;
    }

    return FALSE;
}

So if I enter Russian text, for example, I get a valid wide string and everything is OK. If I switch to English and enter, say, "word", the buffer in text does not look like a wide string: I expect "w\0o\0r\0d" but I get "word".

Instead I get a regular char* string, which is really bad because I need to parse the text by a rule: search for the ',' character and copy the data to another buffer with wcsncpy, so I must always have a properly formatted wchar_t* string. Is there any way to deal with this, and why doesn't GetWindowTextW produce a proper wide string? I am compiling my project with the UNICODE character set, not multibyte.

UPDATED THE CODE

char *test = (char*)text gives a valid ANSI string if I enter only Latin characters in the input box, not a properly formatted wchar_t* string.

Vlad
  • How are you looking at `text` that you can tell it's not `w\0o\0r\0d`? – andlabs Aug 16 '16 at 18:28
  • We don't know what *hCtrl* is. Please provide all required information. In this case: Does the handle reference a control? In the same process or another process? Is it a standard control or a custom control? In case it is a custom control, show the code for it as well. – IInspectable Aug 16 '16 at 18:28
  • @andlabs I output it in a for loop – Vlad Aug 16 '16 at 18:30
  • Show the for loop too, please. – andlabs Aug 16 '16 at 18:32
  • @yasofiz I can assure you GetWindowTextW does exactly what it is documented to do, and exactly how it is documented to do it. Include your *exact* code for "output in a for loop" in the code of your question. Not something "like" your for loop. The *exact* loop, with all proper declarations of all used variables included, and that includes the decl for `text` and `length` as well. Also include the window class name of the control you're fetching text from. – WhozCraig Aug 16 '16 at 18:32
  • @WhozCraig I updated my code – Vlad Aug 16 '16 at 18:40
  • @andlabs I updated the code – Vlad Aug 16 '16 at 18:40
  • @IInspectable I updated the code – Vlad Aug 16 '16 at 18:40
  • I also asked a number of questions. You didn't answer any of them. – IInspectable Aug 16 '16 at 18:51
  • @IInspectable It is the same process. The handle refers to the control, and it's not a custom control – Vlad Aug 16 '16 at 18:53
  • @IInspectable It is an `Edit control` from the Visual Studio 2015 resource builder, placed in a dialog – Vlad Aug 16 '16 at 18:55
  • Not that it makes a difference, but shouldn't `GetWindowTextW(hCtrl, text, length + sizeof(wchar_t));` really be `... + 1);`? Besides, don't ignore compiler warnings. You are getting compiler warnings. Please fix them. – IInspectable Aug 16 '16 at 19:01
  • @IInspectable By using the safe versions of the wide-char functions? – Vlad Aug 16 '16 at 19:04
  • The warning I had in mind was the one telling you, that you are potentially skipping initialization of local variables in your switch-case statement. I'm sure you are getting other warnings, too. – IInspectable Aug 16 '16 at 19:06

3 Answers

A C-style string is a sequence of characters, terminated by a NUL character. Anything from the first NUL character is not considered part of the string.

When you call OutputDebugStringA with an argument of type char[2], where the first element is an ASCII character and the second is \0 [1], it is interpreted as a string of length 1. Consequently, you are printing the ASCII characters only.

You are dealing with wide character strings. Your logic to deduce the string type is wrong.

[1] That's how a UTF-16LE encoded ASCII character will be stored in your given scenario.
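
To make this concrete, here is a minimal portable sketch (not the asker's code). It assumes `char16_t` as a stand-in for the 16-bit `wchar_t` used on Windows, and the helper names `narrowEachUnit` and `utf16leBytes` are made up for illustration. The first helper imitates the `dbg[0] = text[pos]` loop from the question, except that it stops before the terminator; the second lays the same code units out as UTF-16LE bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Imitates the question's loop: each 16-bit code unit is narrowed to a char
// (dbg[0] = text[pos]) and "printed" as a one-character C string.
std::string narrowEachUnit(const char16_t* text, std::size_t length) {
    assert(text != nullptr);
    std::string printed;
    for (std::size_t pos = 0; pos < length; ++pos) {
        char dbg[2] = { static_cast<char>(text[pos]), '\0' };
        printed += dbg;  // appends up to the first NUL, as OutputDebugStringA would print
    }
    return printed;
}

// Lays the same code units out as UTF-16LE bytes (low byte first).
std::vector<unsigned char> utf16leBytes(const char16_t* text, std::size_t length) {
    assert(text != nullptr);
    std::vector<unsigned char> bytes;
    for (std::size_t pos = 0; pos < length; ++pos) {
        bytes.push_back(static_cast<unsigned char>(text[pos] & 0xFF));
        bytes.push_back(static_cast<unsigned char>(text[pos] >> 8));
    }
    return bytes;
}
```

For `u"word"`, `narrowEachUnit` walks 4 code units and yields `"word"`, while `utf16leBytes` yields 8 bytes with a zero after every letter. The loop indexes code units, not bytes, so it never lands on the zero bytes.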
IInspectable
  • I checked in the debugger: `dbg[0]` is never `\0`. So for the text "word" I get exactly 4 iterations – Vlad Aug 16 '16 at 18:51
  • @yasofiz: Of course `dbg[0]` is different from `0` when dealing with ASCII characters, encoded as UTF-16LE. It's `dbg[1]` that's `0`. And yes, it takes 4 iterations to iterate over 4 code units. Why does this surprise you? – IInspectable Aug 16 '16 at 18:55
  • I see now I messed up big time. – Vlad Aug 16 '16 at 18:59

Your text variable is a wchar_t pointer (even though the definition isn't shown), so of course any attempt to display it will show whole UTF-16 characters. You'll only get the embedded \0 characters if you're inspecting a char * buffer, as it will break every wchar_t unit into multiple pieces.
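
As a rough illustration of what a `char *` view of such a buffer sees, the sketch below builds the UTF-16LE byte layout explicitly and measures it with `strlen`. It is a portable approximation, not Windows code: `char16_t` stands in for the 16-bit `wchar_t`, and `narrowViewLength` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns how many chars a char*-based viewer would see before the first
// zero byte when pointed at the UTF-16LE encoding of `text`.
std::size_t narrowViewLength(const char16_t* text, std::size_t length) {
    assert(text != nullptr);
    std::vector<char> bytes;
    for (std::size_t pos = 0; pos < length; ++pos) {
        bytes.push_back(static_cast<char>(text[pos] & 0xFF));  // low byte first
        bytes.push_back(static_cast<char>(text[pos] >> 8));    // then high byte
    }
    bytes.push_back('\0');  // two zero bytes form the wide terminator
    bytes.push_back('\0');
    return std::strlen(bytes.data());
}
```

Under these assumptions `narrowViewLength(u"word", 4)` is 1, because the high byte of `w` is zero and ends the narrow view after one character. For a Cyrillic word such as `u"\u0441\u043B\u043E\u0432\u043E"` the result is 10, since none of those code units contains a zero byte. That difference is in the byte view only; both buffers are equally valid wide strings.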

Mark Ransom
  • @yasofiz that code is nearly impossible to follow. I see you're declaring a `char * test` but I don't see anywhere that you're using it. – Mark Ransom Aug 16 '16 at 18:48
  • I placed it there to test whether `text` is a properly formatted wide string. I can see that the char * test value is "word", which means that it is not w\0o\0r\0d like I expect it to be – Vlad Aug 16 '16 at 18:50

Based on the updated code ...

When you call OutputDebugStringA for one of the bytes that should be zero, you'll see no output. You've effectively printed an empty string. So it will appear as if the zero bytes aren't there, but they are.
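
Since the zero bytes are there and the buffer is an ordinary wide string, the comma parsing the question is after can work in whole `wchar_t` units, with every length counted in characters rather than bytes. The following is only a sketch of that idea, assuming `std::wstring` is acceptable in place of the manual heap buffers; `splitMasks` is a hypothetical name, not something from the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a comma-separated wide string into its parts.
// All positions and lengths are in wchar_t units, never in bytes.
std::vector<std::wstring> splitMasks(const std::wstring& text) {
    std::vector<std::wstring> masks;
    std::wstring::size_type start = 0;
    while (true) {
        const std::wstring::size_type comma = text.find(L',', start);
        if (comma == std::wstring::npos) {
            masks.push_back(text.substr(start));  // last (or only) part
            break;
        }
        masks.push_back(text.substr(start, comma - start));
        start = comma + 1;  // continue after the comma
    }
    assert(!masks.empty());  // even an empty input yields one empty part
    return masks;
}
```

With this approach, `splitMasks(L"*.txt,*.doc")` returns two elements, `L"*.txt"` and `L"*.doc"`, and no division by `sizeof(wchar_t)` is needed anywhere.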

Adrian McCarthy
  • But I see in the debugger that `dbg[0]` is not zero, and the counter stops at exactly `4` for the text "word" – Vlad Aug 16 '16 at 18:45
  • Also, you can see the line `char *test = (char*)text;` in my code. If I enter the text "word" and set a breakpoint on this line, I can see that the `char * test` value is "word", which means that it is not `w\0o\0r\0d` like I expect it to be – Vlad Aug 16 '16 at 18:48