I want to obtain the text from a chat room. To do that, I use the Marshal class to get a pointer to the string and convert it back to a string with Marshal.PtrToStringUni. My target string is written in Vietnamese (UTF-8, codepage Windows-1258), and I cannot get it to display correctly (the result shows weird Chinese characters and symbols). What should I change in the code below to get it right? Thank you ~

    'API Declaration
    Declare Auto Function SendMessage Lib "user32.dll" (ByVal hWnd As IntPtr, ByVal msg As Integer, _
        ByVal wParam As IntPtr, ByVal lParam As IntPtr) As IntPtr
    Declare Auto Function FindWindow Lib "user32.dll" (ByVal lpClassName As String, ByVal lpWindowName As String) As IntPtr

    'Sub to get chat room text
    'I already got the handle of the chat text area (RoomText)
    Private Sub GetText()
        'Get text length
        length = SendMessage(RoomText, WM_GETTEXTLENGTH, 0, 0) + 1
        'Allocate memory for the buffer that receives the text
        Dim Handle As IntPtr = Marshal.AllocHGlobal(length)
        'Send WM_GETTEXT message to the chat window
        Dim NumText As Integer = SendMessage(Hwnd, WM_GETTEXT, length, Handle)
        'Copy the characters from the unmanaged memory to a managed string
        Dim Text As String = Marshal.PtrToStringUni(Handle)
        'Display the string using a textbox
        TextBox1.AppendText(Text)
    End Sub

Here is the result of the above code: [screenshot]

P/S: In other attempts, I tried the SendMessageW and SendMessageA functions. Only SendMessageA returns a string that mixes English letters and question marks (something like ng?y de?p...). SendMessageW returns weird characters.

  • [MultiByteToWideChar](https://msdn.microsoft.com/en-us/library/windows/desktop/dd319072.aspx). To understand why, read [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/). – IInspectable Aug 01 '17 at 07:03
  • @IInspectable. Thank you for your hint. – Garena Plus Beta Aug 01 '17 at 07:06
  • You are calling SendMessageA. Call SendMessageW instead. – David Heffernan Aug 01 '17 at 07:41
  • @DavidHeffernan. It does not work either. I am experimenting with MultiByteToWideChar at the moment ^^. – Garena Plus Beta Aug 01 '17 at 08:13
  • You are going in the wrong direction now. But as a general rule I ignore "it doesn't work" comments. – David Heffernan Aug 01 '17 at 08:17
  • @DavidHeffernan: As I read the question, the *source* string is UTF-8 encoded, so `MultiByteToWideChar` in combination with `SendMessageW` should yield the desired result. Although I'm somewhat confused by the term *"target string"* as well as a codepage alongside a Unicode character encoding. – IInspectable Aug 01 '17 at 08:51
  • @IInspectable It's pretty clear that the asker doesn't understand what is going on, putting UTF8 next to Windows-1258. I've never known WM_GETTEXT to yield UTF8 encoded text. Have you? Anyway, in vb.net you'd use Encoding.UTF8.GetString to decode the bytes. – David Heffernan Aug 01 '17 at 09:00
  • @DavidHeffernan: A user-provided window class *could* handle `WM_GETTEXT` and return UTF-8 encoded text. Although I'd have to agree with you, that the OP has misanalyzed the problem. – IInspectable Aug 01 '17 at 09:05
  • After reading the ref @IInspectable gave, I understand more about the problem. My only confusion at the moment is: how can I get the byte array of the chat room text? If I use SendMessageW to retrieve the text, how should I declare it to get the chat room text as a byte array, David? What I did was declare a string to receive the text from SendMessageW, and it returned weird characters. – Garena Plus Beta Aug 01 '17 at 11:40
  • We need to know if the source window uses Unicode. Could you please run Spy++, select the chat child window, open properties and tell us whether it shows "Unicode" for the window procedure? – zett42 Aug 01 '17 at 12:02
  • You don't need to perform any extra steps to retrieve the message. Your bug is in properly *interpreting* it, specifically the `Marshal.PtrToStringUni` is wrong. Unless you properly analyze your problem, we cannot tell you what would be correct. – IInspectable Aug 01 '17 at 12:09
  • @zett42: I tried with Spyxx and it showed: Window Proc: 059603D0 (Unicode) (Subclassed) . – Garena Plus Beta Aug 01 '17 at 12:34
  • @IInspectable: Yes, I know the problem is right there, because PtrToStringUni returns UTF-16 encoded text while the text from the chat room is UTF-8. But is it possible to get the byte array of the string from the string pointer instead of using PtrToStringUni? I searched for this before but could not find any workaround. – Garena Plus Beta Aug 01 '17 at 12:37
  • @DavidHeffernan. Yes, SendMessageW does solve the problem. The reason I got Chinese characters is that I did not specify the window title when finding the handles. It sounds strange, but in this command: hwndparent=FindWindow("DlgGroupChat Window Class", "Weather around the world - Voice Room"), if I change the "Weather around the world" title to "" for generality, the code does not work correctly. Thanks all again for your time! Thanks IInspectable for your ref about encoding : ) – Garena Plus Beta Aug 01 '17 at 13:59
  • There is no UTF8 here at all – David Heffernan Aug 01 '17 at 14:10
  • Your problem is the assumption, that the chat window text were UTF-8. If Spyxx reports *"Unicode"*, it is. With very few exceptions you can use Unicode and UTF-16 interchangeably on Windows. – IInspectable Aug 01 '17 at 14:20
  • You are right @IInspectable! – Garena Plus Beta Aug 01 '17 at 23:57
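
On the question raised in the comments of whether the raw bytes can be read from the string pointer: a minimal sketch, assuming the buffer holds UTF-16 text as Spy++ suggests. `Marshal.Copy` can pull the bytes out of unmanaged memory, and the same bytes can then be decoded with different encodings to see which one round-trips. The module name, the `ReadBytes` helper and the sample string are hypothetical, and `Marshal.StringToHGlobalUni` only stands in for the buffer that `WM_GETTEXT` would fill.

```vbnet
Imports System
Imports System.Runtime.InteropServices
Imports System.Text

Module BufferDecodingSketch
    ' Hypothetical helper: copies byteCount raw bytes out of an unmanaged buffer.
    Function ReadBytes(buffer As IntPtr, byteCount As Integer) As Byte()
        Dim bytes(byteCount - 1) As Byte
        Marshal.Copy(buffer, bytes, 0, byteCount)
        Return bytes
    End Function

    Sub Main()
        Dim original As String = "ngày đẹp"
        ' Stand-in for the buffer filled by WM_GETTEXT: UTF-16 text plus a null terminator.
        Dim buffer As IntPtr = Marshal.StringToHGlobalUni(original)
        Try
            ' Two bytes per UTF-16 code unit; the terminator is not copied.
            Dim raw As Byte() = ReadBytes(buffer, original.Length * 2)
            ' Decoding with the encoding the bytes were written in round-trips.
            Console.WriteLine(Encoding.Unicode.GetString(raw) = original)
            ' Decoding the same bytes as UTF-8 does not.
            Console.WriteLine(Encoding.UTF8.GetString(raw) = original)
        Finally
            Marshal.FreeHGlobal(buffer)
        End Try
    End Sub
End Module
```

If the bytes really were UTF-8 or Windows-1258, the matching `Encoding` instance would be the one to pass them to; in this thread the window is reported as Unicode, so `Encoding.Unicode` (or `Marshal.PtrToStringUni` directly) is the fitting choice.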

1 Answer

    'API Declaration
    Public Declare Function SendMessage Lib "user32.dll" Alias "SendMessageW" (ByVal hwnd As Int32, ByVal wMsg As Int32, ByVal wParam As Int32, ByVal lParam As Int32) As Int32
    Declare Auto Function FindWindow Lib "user32.dll" (ByVal lpClassName As String, ByVal lpWindowName As String) As IntPtr

    'Sub to get chat room text
    'I already got the handle of the chat text area (RoomText)
    Private Sub GetText()
        'Get text length
        length = SendMessage(RoomText, WM_GETTEXTLENGTH, 0, 0) + 1
        'Allocate memory for the buffer that receives the text.
        'Note that length is the character count, but Marshal.AllocHGlobal needs a byte count.
        'A character in a VB.NET string uses 2 bytes, so I multiply by 2.
        Dim Handle As IntPtr = Marshal.AllocHGlobal(length * 2)
        'Send WM_GETTEXT message to the chat window
        Dim NumText As Integer = SendMessage(Hwnd, WM_GETTEXT, length, Handle)
        'Copy the characters from the unmanaged memory to a managed string
        Dim Text As String = Marshal.PtrToStringUni(Handle)
        'Display the string using a textbox
        TextBox1.AppendText(Text)
    End Sub
  • This is the **wrong** signature. It will fail on 64-bit Windows. And it doesn't solve your issue, because you have failed to analyze your issue. I'm afraid, this answer is not useful. -1. – IInspectable Aug 01 '17 at 14:21
  • @IInspectable. I am testing this code on my 64-bit Windows 10 and it works. Yes, I misunderstood: I thought the text in the chat window was UTF-8. What is your view on a more general way to obtain the text? By the way, I don't see any chance for MultiByteToWideChar to be useful in this case. Am I right? – Garena Plus Beta Aug 01 '17 at 23:56
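
As for a more general way to obtain the text, one possible sketch, assuming the target window is a Unicode window: declare `SendMessageW` with `IntPtr` for the handle, both parameters and the return value, so the signature matches on 32-bit and 64-bit processes, size the buffer in bytes, and free it afterwards. The module and function names are hypothetical; the message constants are the standard `WM_GETTEXT` (0x000D) and `WM_GETTEXTLENGTH` (0x000E) values.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Module WindowTextSketch
    Private Const WM_GETTEXT As Integer = &HD
    Private Const WM_GETTEXTLENGTH As Integer = &HE

    ' Pointer-sized parameters and return value are declared as IntPtr.
    <DllImport("user32.dll", EntryPoint:="SendMessageW", CharSet:=CharSet.Unicode)>
    Private Function SendMessage(hWnd As IntPtr, msg As Integer,
                                 wParam As IntPtr, lParam As IntPtr) As IntPtr
    End Function

    ' Hypothetical helper: returns the text of the given window as a managed string.
    Function GetWindowTextViaMessage(hWnd As IntPtr) As String
        ' Length in characters, excluding the null terminator.
        Dim length As Integer = SendMessage(hWnd, WM_GETTEXTLENGTH, IntPtr.Zero, IntPtr.Zero).ToInt32()
        Dim capacity As Integer = length + 1
        ' AllocHGlobal takes a byte count; a UTF-16 code unit is 2 bytes.
        Dim buffer As IntPtr = Marshal.AllocHGlobal(capacity * 2)
        Try
            ' wParam is the buffer capacity in characters, including the terminator.
            Dim copied As Integer = SendMessage(hWnd, WM_GETTEXT, New IntPtr(capacity), buffer).ToInt32()
            ' Read exactly the number of characters the window reported as copied.
            Return Marshal.PtrToStringUni(buffer, copied)
        Finally
            Marshal.FreeHGlobal(buffer)
        End Try
    End Function
End Module
```

A call would then look like `TextBox1.AppendText(GetWindowTextViaMessage(RoomText))`, with `RoomText` being the chat text area handle from the question.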