Is it safe to ignore the potential inaccuracy of WM_GETTEXTLENGTH for Edit Controls?

Question

> When the WM_GETTEXTLENGTH message is sent, the DefWindowProc function returns the length, in characters, of the text. Under certain conditions, the DefWindowProc function returns a value that is larger than the actual length of the text. This occurs with certain mixtures of ANSI and Unicode, and is due to the system allowing for the possible existence of double-byte character set (DBCS) characters within the text.

I assume none of these conditions can occur in a WCHAR edit control, since it's WCHAR only.

I'm inclined to ignore those conditions and use the WM_GETTEXTLENGTH result directly as my length. If WM_GETTEXTLENGTH returns a precomputed value instead of doing a strlen-style scan, that would also be good for performance. I guess I will have to look at the decompilation of Notepad in Ghidra to find out.

Since any potential inaccuracy will only cause you to allocate a larger buffer, it's safe to ignore it. — Michael Chourdakis, Jul 14 '23 at 20:38
@MichaelChourdakis The thing is, I want to use the text data directly via EM_GETHANDLE, so I won't be calling WM_GETTEXT. — user363406, Jul 14 '23 at 20:39
Yes, as @MichaelChourdakis points out, this is used to allocate the buffer so you can follow with a WM_GETTEXT. Once you have read the string in, however, you should use a `strlen`-style function if you need the exact length. — Garr Godfrey, Jul 14 '23 at 20:39
so what do you need the length for? depends on how you use it. — Garr Godfrey, Jul 14 '23 at 20:41
@user363406 why do you want to use the data directly? You can't use it in a richedit and you can't change it. The performance gain is really negligible. Most edit controls contain small text that you can afford to store in the stack anyway. — Michael Chourdakis, Jul 14 '23 at 20:41
@GarrGodfrey I need the length because I'm writing a find algorithm, which needs to avoid testing chars outside the range of the text buffer. — user363406, Jul 14 '23 at 20:43
@MichaelChourdakis Trust me, it's not negligible. On a large file like no$gba's gbatek.txt, copying text in every time to do a find is slow. And I don't want to maintain a parallel buffer. — user363406, Jul 14 '23 at 20:44
in theory, some characters require 3 or 4 bytes, so a similar problem might exist for WCHAR even if it isn't obviously documented. It's probably unlikely, and if so, you probably just have extra NULLs at the end. but you might really want to find that first NULL — Garr Godfrey, Jul 14 '23 at 20:50
@GarrGodfrey Yeah, I think I'm just going to `wcslen` the buffer I get from EM_GETHANDLE. — user363406, Jul 14 '23 at 20:50
i suggest you profile WM_GETTEXTLENGTH vs strlen on the same text and see if WM_GETTEXTLENGTH actually saves any time — Garr Godfrey, Jul 14 '23 at 20:51
`WM_GETTEXTLENGTH` returns the text length necessary to allocate buffer for the following `WM_GETTEXT` (or `GetDlgItemText()` call). If it is larger than real text length, this is not a big problem. — i486, Jul 16 '23 at 20:27
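
For the find use case raised in the comments, a length-bounded search might look roughly like the sketch below. `find_bounded` is a hypothetical helper, not a Win32 API. It assumes the caller already has a pointer to the text (for example from EM_GETHANDLE plus LocalLock) and a length in characters from whichever source it trusts.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not a Win32 API): find `needle` in the first
 * `text_len` characters of `text`, starting at index `start`.
 * Returns the match index, or (size_t)-1 if there is no match.
 * It never reads text[i] for i >= text_len, so the result only depends
 * on the length passed in, not on where a NUL happens to sit. */
size_t find_bounded(const wchar_t *text, size_t text_len,
                    const wchar_t *needle, size_t needle_len,
                    size_t start)
{
    assert(text != NULL && needle != NULL);

    if (needle_len == 0 || needle_len > text_len)
        return (size_t)-1;

    /* Last index at which a full needle still fits inside the text. */
    size_t last = text_len - needle_len;

    for (size_t i = start; i <= last; i++) {
        if (wmemcmp(text + i, needle, needle_len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

With this shape, an overestimated length is the dangerous case, since the loop would then compare against characters past the real end of the text. An underestimate only causes missed matches near the end.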

user363406 · Answer 1 · 2023-07-29T04:36:56.520

Well, I still don't know whether it's safe to rely on the length returned by WM_GETTEXTLENGTH, but it is definitely much faster than calling wcslen on the pointer you get from EM_GETHANDLE + LocalLock.
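
One possible middle ground, sketched below, is to keep the fast reported length but sanity-check it in constant time, falling back to a scan only when the check fails. This is not something the documentation prescribes. `checked_text_length` is a hypothetical helper, and it assumes the text contains no embedded NUL characters and that the caller knows the buffer capacity in characters (for instance from `LocalSize` on the EM_GETHANDLE handle divided by `sizeof(WCHAR)`).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not a Win32 API).
 * buf      : locked text buffer, e.g. LocalLock() on the EM_GETHANDLE handle
 * capacity : buffer size in characters, e.g. LocalSize(handle) / sizeof(WCHAR)
 * reported : value returned by WM_GETTEXTLENGTH
 * Assumption: the text contains no embedded NUL characters. */
size_t checked_text_length(const wchar_t *buf, size_t capacity, size_t reported)
{
    assert(buf != NULL);

    /* O(1) check: the terminator sits exactly at `reported` and the
     * character just before it is not a NUL. */
    if (reported < capacity && buf[reported] == L'\0' &&
        (reported == 0 || buf[reported - 1] != L'\0'))
        return reported;

    /* Fallback: scan for the first NUL, never past the allocation. */
    const wchar_t *nul = wmemchr(buf, L'\0', capacity);
    return nul ? (size_t)(nul - buf) : capacity;
}
```

In the common case this costs two character reads, so it should stay close to the WM_GETTEXTLENGTH timing while still catching an overestimate.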

Results from a benchmark on a file of 4107893 characters.
Time is the total number of seconds taken to compute the length 512 times:

WM_GETTEXTLENGTH:
    Length: 4107893
    Time: 0.000058
wcslen:
    Length: 4107893
    Time: 0.623284
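
The wcslen side of a measurement like this could be reproduced with a harness along the following lines. This is a sketch, not the code that produced the numbers above. `time_wcslen` is a hypothetical name, and `clock()` is used only for portability. Timing the WM_GETTEXTLENGTH side at the microsecond scale shown would need a higher-resolution timer such as `QueryPerformanceCounter`.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <wchar.h>

/* Hypothetical harness: call wcslen on `buf` `reps` times.
 * Returns the elapsed time in seconds as measured by clock() and
 * stores the last computed length in *out_len. */
double time_wcslen(const wchar_t *buf, int reps, size_t *out_len)
{
    assert(buf != NULL && out_len != NULL && reps > 0);

    /* A volatile source pointer and a volatile sink make it hard for
     * the compiler to hoist or drop the repeated wcslen calls. */
    const wchar_t *volatile src = buf;
    volatile size_t sink = 0;

    clock_t t0 = clock();
    for (int i = 0; i < reps; i++)
        sink = wcslen(src);
    clock_t t1 = clock();

    *out_len = sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```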

However, the performance of WM_GETTEXT/wcslen is probably not the problem. I realized that the slowness my app experiences is the same slowness Notepad shows, and it occurs when I send EM_REPLACESEL on a large document. For some reason, replacing any amount of text in a large document with that message is pretty slow.
