I'd have thought that's what the LengthInTextElements property was for. MSDN describes this property as:
The number of base characters, surrogate pairs, and combining character sequences in this StringInfo object.
So it sure looks like it should count combining sequences as a single character. But either it doesn't work or I am fundamentally misunderstanding something. This crappy test program ...
    using System;
    using System.Globalization;

    class Program
    {
        static void Main(string[] args)
        {
            // Combining acute accent (U+0301) placed before the letter 'e' (U+0065).
            string foo = "\u0301\u0065";
            Console.WriteLine("String:\t{0}", foo);
            Console.WriteLine("Length:\t{0}", foo.Length);
            Console.WriteLine("TextElements:\t{0}", new StringInfo(foo).LengthInTextElements);
            Console.ReadLine();
        }
    }
generates this output...
String: `e
Length: 2
TextElements: 2
I would dearly love to count the combining sequence "\u0301\u0065" as a single character. Can this be done with StringInfo?
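Well, I figured out what I was doing wrong, and it's somewhat embarrassing: I had reversed the order of the character and the diacritic. A combining mark has to follow the base character it modifies, so with U+0301 first it had nothing to attach to and was counted as its own text element. Making the following ever-so-tiny change corrects the problem: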
    using System;
    using System.Globalization;

    class Program
    {
        static void Main(string[] args)
        {
            // Base letter 'e' (U+0065) first, then the combining acute accent (U+0301).
            string foo = "\u0065\u0301";
            Console.WriteLine("String:\t{0}", foo);
            Console.WriteLine("Length:\t{0}", foo.Length);
            Console.WriteLine("TextElements:\t{0}", new StringInfo(foo).LengthInTextElements);
            Console.ReadLine();
        }
    }
So ... it was just a matter of correctly encoding my test data. With the base character first, Length is still 2, but TextElements comes out as 1.
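If you want to see where StringInfo draws the boundaries rather than just the count, something like the following sketch may help. It walks the string with StringInfo.GetTextElementEnumerator and dumps the UTF-16 code units of each text element. The class and method names (TextElementDemo, SplitTextElements, Describe) are made up for illustration and are not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper class, named here only for illustration.
public static class TextElementDemo
{
    // Returns the text elements of s, in order, as StringInfo sees them.
    public static List<string> SplitTextElements(string s)
    {
        var elements = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(s);
        while (enumerator.MoveNext())
        {
            elements.Add(enumerator.GetTextElement());
        }
        return elements;
    }

    // Formats the UTF-16 code units of one text element, e.g. "U+0065 U+0301".
    public static string Describe(string element)
    {
        var parts = new List<string>();
        foreach (char c in element)
        {
            parts.Add("U+" + ((int)c).ToString("X4"));
        }
        return string.Join(" ", parts);
    }
}
```

Calling SplitTextElements on "\u0065\u0301" should give back a single element that Describe renders as "U+0065 U+0301", while "\u0301\u0065" should come back as two separate elements, which makes a reversed base/diacritic order easy to spot in test data.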