0

I am reaching the last stage of implementing my rope (a more scalable alternative to String). Naturally, I want every operation to give the same result as the corresponding String operation whenever possible.

Doing this for ordinal operations is fairly simple, but I am worried about implementing culture-sensitive operations correctly, especially since I know only two languages, and in both of them culture-sensitive operations behave exactly like ordinal ones!

So are there any specific things I could test to get at least some confidence that I am doing things correctly? I know, for example, about ß being equal to SS when ignoring case in German, and about dotted and dotless i in Turkish.

Alexey Romanov

3 Answers

2

Surrogate pairs, if you plan to support them, including invalid combinations (e.g. only one half of a pair).

If you're doing encoding and decoding, make sure you retain enough state to cope with being given arbitrary blocks of binary data to decode, which may end halfway through a character, with the remaining half coming in the next block.
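As a rough sketch of both points (the class and method names here are made up for illustration, and a plain string stands in for the rope): the first helper reports whether a split index would fall between the two halves of a surrogate pair, and the second feeds UTF-8 bytes to a single System.Text.Decoder in arbitrary chunks, relying on the decoder to remember the bytes of an unfinished character between calls.

```csharp
using System;
using System.Text;

public static class SurrogateChecks
{
    // True when cutting `text` at `index` would separate a high surrogate
    // from the low surrogate that immediately follows it.
    public static bool SplitsSurrogatePair(string text, int index)
    {
        return index > 0 && index < text.Length
            && char.IsHighSurrogate(text[index - 1])
            && char.IsLowSurrogate(text[index]);
    }

    // Decodes UTF-8 that arrives in arbitrary chunks. One Decoder instance is
    // reused for all chunks so that a character split across two chunks is
    // completed when the second chunk arrives.
    public static string DecodeChunks(params byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var result = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // GetCharCount takes the decoder's pending bytes into account.
            char[] buffer = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int written = decoder.GetChars(chunk, 0, chunk.Length, buffer, 0);
            result.Append(buffer, 0, written);
        }
        return result.ToString();
    }
}
```

A test could, for instance, split the four UTF-8 bytes of U+1F600 across two chunks and check that the decoded text is the single surrogate pair, then check that index 2 of "a" + that pair + "b" is reported as an unsafe split point.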

Jon Skeet
1

The Turkish test is the best I know :)
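A minimal sketch of what such a test might exercise, assuming the tr-TR culture data is available on the machine (the class and method names are illustrative only). Under Turkish rules, upper-casing "i" should give the dotted capital İ (U+0130) and lower-casing "I" should give the dotless ı (U+0131), whereas the invariant culture maps "i" to "I". A rope's culture-sensitive casing can be compared against these String results.

```csharp
using System;
using System.Globalization;

public static class TurkishCasing
{
    // Assumes Turkish culture data is installed; with invariant-only
    // globalization the results would differ.
    private static readonly CultureInfo Turkish = new CultureInfo("tr-TR");

    public static string UpperTurkish(string s)
    {
        return s.ToUpper(Turkish);
    }

    public static string LowerTurkish(string s)
    {
        return s.ToLower(Turkish);
    }

    // Culture-sensitive, case-insensitive equality under Turkish rules.
    public static bool EqualsIgnoreCaseTurkish(string a, string b)
    {
        return string.Compare(a, b, true, Turkish) == 0;
    }
}
```

The same inputs can then be run through the rope's equivalents of ToUpper, ToLower and Compare, and the results checked against what String returns.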

leppie
1

You should mimic the implementations of the String methods and let the core library do this work for you. It is very hard to take into account every possible aspect of every culture. Instead of reinventing the wheel, use Reflector on the String methods and look at the internal calls. For example, String.Compare uses CultureInfo.CurrentCulture.CompareInfo.Compare to compare two strings in the current culture.
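To illustrate the delegation, here is a small sketch (the class and method names are hypothetical) that calls the String-level overload and the CompareInfo-level method side by side with an explicit culture, so a test can confirm that both orderings agree before the rope is wired to CompareInfo directly.

```csharp
using System;
using System.Globalization;

public static class CompareDelegation
{
    // Comparison through the String API.
    public static int ViaString(string a, string b, CultureInfo culture)
    {
        return string.Compare(a, b, culture, CompareOptions.None);
    }

    // The same comparison through the culture's CompareInfo.
    public static int ViaCompareInfo(string a, string b, CultureInfo culture)
    {
        return culture.CompareInfo.Compare(a, b, CompareOptions.None);
    }
}
```

Only the sign of the two results should be compared, since the exact magnitude of a comparison result is not something to rely on.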

Diadistis
  • Yes, that's the plan. However, CultureInfo methods take strings. This means I need to convert a part of my rope into a string. The question is, do I have enough information to know which part? – Alexey Romanov Jan 12 '09 at 20:07
  • For example, when checking EndsWith(string suffix), is it enough to take the last suffix.Length characters of my rope? Probably not always. Is it enough to take the last suffix.Length + 5 characters? Probably yes. – Alexey Romanov Jan 12 '09 at 20:09
  • You don't need to know, just pass the rope string to the appropriate CultureInfo method : CultureInfo.CurrentCulture.CompareInfo.IsSuffix(rope.ToString(), suffix, CompareOptions.None); // Taken from String.EndsWith – Diadistis Jan 12 '09 at 20:15
  • Well, converting the entire Rope to string for each such operation would rather kill performance and most of the point of implementing Rope. – Alexey Romanov Jan 12 '09 at 20:29
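To make the windowing concern from the comments concrete, here is a hedged sketch in which a plain string stands in for the rope. The helper name and the slack parameter are assumptions made for illustration; nothing here establishes that any particular slack is always sufficient. The helper converts only the last suffix.Length + slack characters and passes them to CompareInfo.IsSuffix.

```csharp
using System;
using System.Globalization;

public static class SuffixWindow
{
    // Hypothetical helper: `text` stands in for the rope and `slack` is a
    // guess at how many extra characters to include; no bound is proven.
    public static bool EndsWithWindow(string text, string suffix, int slack, CultureInfo culture)
    {
        int window = Math.Min(text.Length, suffix.Length + slack);
        // Note: this cut can itself land inside a surrogate pair or a
        // combining sequence, which a real implementation should avoid.
        string tail = text.Substring(text.Length - window);
        return culture.CompareInfo.IsSuffix(tail, suffix, CompareOptions.None);
    }
}
```

One case worth testing is a text ending in "e" followed by a combining acute accent (U+0301) against the precomposed suffix é (U+00E9). With zero slack the window holds only the combining mark, so the check should fail even though a culture-sensitive IsSuffix over the whole text would be expected to succeed, which is why exactly suffix.Length characters is not always enough.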