I need to compute the width of a column with many rows (column AutoSize feature). Using Canvas.TextWidth is far too slow.

Current solution: A text measurer class builds a lookup table for a fixed alphabet once and then computes the width of a given string very quickly by summing the character widths retrieved from the table. For characters not contained in the lookup table, it uses the average character width (also computed once).
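For readers who want to see the idea concretely, here is a minimal sketch of such a lookup-table measurer. It is written in Python rather than Delphi, and it is not the actual class from the question: `LookupTextMeasurer`, `measure_char`, and the toy widths are all hypothetical names and values chosen for illustration. `measure_char` stands in for one slow, exact call such as Canvas.TextWidth on a single character.

```python
from typing import Callable, Dict, Iterable


class LookupTextMeasurer:
    """Estimates string widths by summing cached per-character widths."""

    def __init__(self, alphabet: Iterable[str], measure_char: Callable[[str], int]):
        # measure_char is a hypothetical stand-in for one slow, exact call
        # (for example Canvas.TextWidth on a single character).
        self._widths: Dict[str, int] = {ch: measure_char(ch) for ch in alphabet}
        if not self._widths:
            raise ValueError("alphabet must not be empty")
        # Fallback for characters outside the table: the average cached width.
        self._average = sum(self._widths.values()) / len(self._widths)

    def estimate(self, text: str) -> float:
        # Pure table lookups; no drawing API is touched here.
        return sum(self._widths.get(ch, self._average) for ch in text)


# Toy widths for illustration only; real values come from the font in use.
toy_widths = {"i": 3, "a": 6, "W": 11}
measurer = LookupTextMeasurer("iaW", lambda ch: toy_widths[ch])
```

With a Latin-only alphabet, every character outside the table is charged the Latin average, which is presumably why the estimate drifts for scripts whose glyphs are typically wider.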

Problem: This works well for European languages but not for Asian languages.

Question: What's the best way to tackle this problem? How can such an AutoSize feature be realized without the relatively slow Canvas functions and without depending on a specific alphabet?

Thanks for any help.

Bruce McGee
jpfollenius

3 Answers

You said you want to get the maximum text width for a column. Can't you, say, take only the 4 or 5 longest strings and get their widths? That way you won't have to find the width for all items and can save quite some time.

Or you use your cache to find the rough length of the strings and then refine that by getting the actual width for the top 4 or 5 items you found.

I don't think it matters a lot whether you use Canvas.TextWidth or GetTextExtentPoint32. Just use one of these to get the exact widths, after you used one of the methods above to guesstimate the longest/widest strings.
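A minimal sketch of this two-step idea, in Python rather than Delphi. The function name `widest_text_width` and its parameters are hypothetical: `estimate` can be the plain string length or a cached per-character sum, and `measure_exact` stands in for Canvas.TextWidth or GetTextExtentPoint32.

```python
import heapq
from typing import Callable, Iterable


def widest_text_width(
    items: Iterable[str],
    estimate: Callable[[str], float],
    measure_exact: Callable[[str], int],
    candidates: int = 5,
) -> int:
    # Step 1: cheap ranking of every string by its estimated width.
    shortlist = heapq.nlargest(candidates, items, key=estimate)
    # Step 2: exact (slow) measurement only for the shortlisted strings.
    return max((measure_exact(s) for s in shortlist), default=0)
```

The exact function is called at most `candidates` times per column, no matter how many rows there are. Whether 4 or 5 candidates are enough depends on the data, so treat the default as a starting point.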

To those who think this doesn't work

If the poster of the original question thinks it could work, I have no reason to think it won't. He knows best what kind of strings can be in the columns he has.

But that is not my main argument. He already wrote that he computes a preliminary text width by adding the predetermined individual widths of the characters. That does not take any kerning into account. Well, kerning normally makes a string narrower, so it still makes sense to check only the top 4 or 5 items for the exact width. The biggest problem that can occur is that the column could be a few pixels too wide, no more. But it will be a lot faster than using TextWidth or GetTextExtentPoint32 or similar functions on each entry (assuming more than 5 entries), and that is what the original poster wanted. I suggest that those who don't believe me simply try it out.

As for using the pure string length: even that is probably good enough. Yes, 'WWW' is probably wider than '!!!!!', but the original poster will probably know best what kind of string material he has, and whether it is feasible. '!!!!!' or 'WWW' are not the usual entries one expects. Especially if you consider that not only one single string is checked, but the longest 4 or 5 strings (or whatever number turns out to be optimal). It is very unlikely that the widest string is not among them. But the original poster can tell if that is possible or feasible. He seems to think it is.

So stop the downvoting and try it out for yourself.

Rudy Velthuis
  • That's very poor. Is "!!!!!" longer or shorter than "WWW" in your font? You cannot use the number of characters to short-cut the width of text, particularly with kerning and the sub-pixel adjustments done by Windows. – mj2008 Sep 27 '11 at 10:42
  • @mj2008: there might be some extreme cases where it does not work, but I think for real-world values it should work pretty well. Note that the suggestion is not to only measure the longest string, but to measure the 4-5 longest strings. Chances are pretty good that you catch the "metrically" longest string. – jpfollenius Sep 27 '11 at 11:13
  • @mj2008: That is why I said he should take the top 4 or 5, not just the "longest". And if his method has all individual character widths cached anyway, he has a rough estimate already. It just needs some refinement. – Rudy Velthuis Sep 27 '11 at 14:23
  • @lkessler: see my update. I am pretty sure it will work, or at least work well enough for what the original poster wants. But I am not quite sure what you mean by "it does matter". What does? The method to get the text width? I doubt it matters, for only 4 or 5 entries. Apparently the original poster has quite a few more than 5, otherwise speed would not be an issue. – Rudy Velthuis Sep 28 '11 at 02:27
  • @Rudy: Good for me now. +1 See my comment under your comment in mj2008's answer. – lkessler Sep 28 '11 at 04:27

I'm afraid you have to use Canvas.TextWidth, or your implementation will be imprecise. The width of text depends on the font kerning, where different character sequences may have different widths (not just the total of individual character widths).
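To make the kerning point concrete, here is a toy model in Python. The advance widths and kerning pairs are invented for illustration and do not come from any real font. Whether and how a real renderer applies such pair adjustments depends on the font and the API used.

```python
# Toy advance widths and kerning pairs; the numbers are invented for illustration.
ADVANCE = {"A": 10, "V": 10, "H": 11}
KERNING = {("A", "V"): -2, ("V", "A"): -2}


def summed_width(text: str) -> int:
    # What a per-character lookup table computes.
    return sum(ADVANCE[ch] for ch in text)


def kerned_width(text: str) -> int:
    # Adds the pair adjustment for each adjacent character pair.
    pairs = zip(text, text[1:])
    return summed_width(text) + sum(KERNING.get(p, 0) for p in pairs)
```

In this model, "AVA" sums to 30 per character but comes out at 26 once the two pair adjustments are applied, while a string without kerning pairs such as "HAH" is unaffected.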

lkessler
Ondrej Kelle
  • The lookup table approach worked okay for me, so it seems that with our font and values the precision is good enough. Thanks for the hint though! I definitely prefer speed over precision here. – jpfollenius Sep 27 '11 at 09:19
  • I'm not 100% sure about this but I think there are some complex scripts which lay out some combinations of adjacent characters into different glyphs, so a character-based lookup in principle won't help you, I'm afraid. – Ondrej Kelle Sep 27 '11 at 09:28
  • Your comment "The lookup table approach worked okay for me" contradicts your statement: "Problem: This works well for European languages but not for Asian languages." – Ondrej Kelle Sep 27 '11 at 09:31
  • I should have said "worked okay for me...until I discovered the problem with Asian languages" :) What I meant is that the precision was quite okay. – jpfollenius Sep 27 '11 at 09:33
  • Actually not wrong. See Rudy's and my comments in mj2008's answer. I edited and rolled back your answer so I could change the -1 to +1. – lkessler Sep 28 '11 at 04:32

Me, I cut out the middle-man and use the Windows API directly. Specifically, I use GetTextExtentPoint32 with the .Handle of the Canvas. There's nothing you can do to be faster, other than caching results in some way, and frankly you'll just add overhead.

mj2008
  • Is that faster than calling `Canvas.TextWidth`? And by faster I mean considerably faster...not just a function call less or something? – jpfollenius Sep 27 '11 at 09:21
  • Why would this be faster than `TCanvas.TextWidth`? All the overhead is in the Win32 code. The thin VCL layer is insignificant surely. – David Heffernan Sep 27 '11 at 09:21
  • This doesn't make any sense, as David says. – Andreas Rejbrand Sep 27 '11 at 09:49
  • Have a look at the code! TextWidth calls TextExtent which checks that the canvas is in the required state. You cut out three calls per check. Now, this is not going to set the world on fire, but it saves doing that many times in a loop. – mj2008 Sep 27 '11 at 10:40
  • Time it. My money says that you won't be able to tell the difference. – David Heffernan Sep 27 '11 at 11:19
  • @David, I did indeed time it. My code lays out over 100 pages and in doing so created 100,000 objects, so I did a lot of timing. – mj2008 Sep 27 '11 at 14:30
  • And what were the results of the timing experiment comparing `GetTextExtentPoint32` and `TCanvas.TextWidth`? – David Heffernan Sep 27 '11 at 14:37
  • Second that. If speed is an ultimate issue and N is large, micro-optimization by using direct GDI calls is appropriate. BTW, @mj2008 using Canvas.Handle can be suboptimal too (invokes getter and HandleNeeded/HandleAllocated) – Premature Optimization Sep 27 '11 at 15:37
  • @Downvoter yes, you are right, and in my code I cache the handle too. – mj2008 Sep 27 '11 at 16:05
  • @David, It is some years now since I did the timing, so I don't have the results. But being aware that you can save time when calling this a lot, even if not great, is worth knowing. – mj2008 Sep 27 '11 at 16:07
  • @Downvoter HandleNeeded/HandleAllocated are no-ops compared to `GetTextExtentPoint32`. – David Heffernan Sep 27 '11 at 16:08
  • GetTextExtentPoint32 is the only correct way to get exact widths with Unicode text. This is the way Windows determines how to fold lines properly, and Microsoft optimized it to the hilt. +1 See: http://stackoverflow.com/questions/3497471/wysiwig-with-unicode/3498368#3498368 – lkessler Sep 28 '11 at 01:57
  • @lkessler: you told me that "it matters". The timing may matter if you must check many many items. For a few (like 4 or 5 per column), it won't. And GetTextExtentPoint32 is what Canvas.TextWidth uses, so these are equivalent WRT accuracy. – Rudy Velthuis Sep 28 '11 at 02:39
  • @Rudy: So sorry. You are correct. Didn't realize TextWidth calls GetTextExtentPoint32. Because you edited your answer, it allowed me to remove the -1 and I gave you a +1 instead. – lkessler Sep 28 '11 at 04:25