I'm just curious: how does .ToUpper() work? Is there some sort of mapping, where a lowercase a has UTF code XYZ and the uppercase A has UTF code XYZ1?
- You might be able to use [ILSpy](http://wiki.sharpdevelop.net/ILSpy.ashx) or similar to find out. – George Duckett Jul 12 '12 at 11:47
- This question http://stackoverflow.com/questions/297703/how-do-you-set-strings-to-uppercase-lowercase-in-unicode seems related. – heijp06 Jul 12 '12 at 11:48
- It eventually maps down to `InternalChangeCaseString`, which is not visible to ILSpy. – Chriseyre2000 Jul 12 '12 at 11:50
- Maybe not re. using a decompiler, it's an internal call (at the end of the call stack). – George Duckett Jul 12 '12 at 11:50
- @Chriseyre2000: Just found that out as you were typing. :) – George Duckett Jul 12 '12 at 11:51
- `String.ToUpper` calls the CultureInfo's (current or specified) [`TextInfo.ToUpper`](http://msdn.microsoft.com/en-us/library/fsc2y169). – Tim Schmelter Jul 12 '12 at 11:55
- @TimSchmelter: Which leads to the obvious question, what does `TextInfo.ToUpper` do? – George Duckett Jul 12 '12 at 11:57
- @GeorgeDuckett: I've posted the docs. ILSpy won't help since `InternalChangeCaseString` is extern. – Tim Schmelter Jul 12 '12 at 11:59
4 Answers
Yes, it's making use of the Unicode metadata. Every character (Unicode code point) has a case as well as case mapping to upper- and lowercase (and title case). .NET uses this information to convert a string to upper- or lowercase. You can find the very same information in the Unicode Character Database.
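To make the "mapping, not a fixed offset" point concrete, here is a small sketch that prints what the runtime reports for a few characters. The class and method names (`CaseMappingDemo`, `Describe`, `CodePoint`) are made up for illustration. In the Unicode Character Database, U+0061 maps to U+0041, U+00FF maps to U+0178 and U+0101 maps to U+0100, so the distance between a letter and its uppercase form varies. What a given runtime prints for the non-ASCII characters depends on its globalization data, so treat the output as something to check rather than a guarantee.

```csharp
using System;

public static class CaseMappingDemo
{
    // Formats a char as a Unicode code point label, e.g. U+0061.
    public static string CodePoint(char c) => "U+" + ((int)c).ToString("X4");

    // Describes the invariant uppercase mapping the runtime reports for c,
    // including the numeric distance between the two code points.
    public static string Describe(char c)
    {
        char upper = char.ToUpperInvariant(c);
        int offset = upper - c;
        return $"{CodePoint(c)} -> {CodePoint(upper)} (offset {offset})";
    }

    public static void Main()
    {
        // An ASCII letter, a Latin-1 letter, a Latin Extended-A letter, and a digit.
        foreach (char c in new[] { 'a', '\u00FF', '\u0101', '1' })
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```

Characters without a case, such as the digit, come back unchanged.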

At its core, `String.ToUpper` just uses the `CurrentCulture`. From the disassembled version of `String.ToUpper` in mscorlib.dll, you can see this:

    public string ToUpper(CultureInfo culture)
    {
        // A null culture is rejected up front.
        if (culture == null)
        {
            throw new ArgumentNullException("culture");
        }
        // The actual case conversion is delegated to the culture's TextInfo.
        return culture.TextInfo.ToUpper(this);
    }

So the result depends on your current culture. There is also an overload that lets you specify an alternative culture.
EDIT
Internally, the call ends up in the `nativeChangeCaseString` function, which has a native implementation. How it is implemented internally I have no idea; that can only be answered by the person who developed it.
As suggested by @Tim, I've added a link to [`TextInfo.ToUpper`](http://msdn.microsoft.com/en-us/library/fsc2y169), which provides some more information on the subject.
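As a rough illustration of the culture dependence described above, the sketch below uppercases the same string with the invariant culture and with Turkish (`tr-TR`). The helper names (`CultureUpperDemo`, `Upper`, `CodePoints`) are hypothetical. With full culture data, Turkish typically maps `i` to the dotted capital `İ` (U+0130) instead of `I`, but that depends on the globalization data available to the runtime, so the code prints code points rather than assuming a result.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class CultureUpperDemo
{
    // Thin wrapper around the culture-aware overload shown in the answer.
    public static string Upper(string s, CultureInfo culture) => s.ToUpper(culture);

    // Renders each UTF-16 code unit as U+XXXX so differences are visible.
    public static string CodePoints(string s) =>
        string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));

    public static void Main()
    {
        const string input = "title";

        Console.WriteLine(CodePoints(Upper(input, CultureInfo.InvariantCulture)));

        try
        {
            // Assumption: the runtime ships Turkish culture data.
            var turkish = new CultureInfo("tr-TR");
            Console.WriteLine(CodePoints(Upper(input, turkish)));
        }
        catch (CultureNotFoundException)
        {
            Console.WriteLine("tr-TR culture data is not available on this runtime.");
        }
    }
}
```

If the uppercase form must not vary between machines, `ToUpperInvariant()` or an explicit `CultureInfo.InvariantCulture` avoids relying on the current culture.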

- I'm not sure that really answers the question though. Saying "it calls this method internally" without information on what that method does. – George Duckett Jul 12 '12 at 11:54
- @GeorgeDuckett: I edited my post, but as I wrote, the **concrete** implementation can only be answered by the person who developed that function. – Tigran Jul 12 '12 at 11:59
- @Tigran: You might want to add the link to [`TextInfo.ToUpper`](http://msdn.microsoft.com/en-us/library/fsc2y169) since there is some more information there, for instance that the returned string might differ in length from the input string, which proves OP's mapping approach wrong. – Tim Schmelter Jul 12 '12 at 12:10
This has been asked before (in a roundabout way) on StackOverflow. Granted, it's not about C# or .NET, but it answers the Unicode part of this question.
If you are interested in the design aspects of the ToUpper() implementation, the following points may help:
- The Flyweight design pattern from the Gang of Four catalog is used to handle character-related functionality.
- In this design pattern, each unit in a collection is an object with defined behavior, and the final object is a collection of these smaller units.
- In the case of String, the given string is handled as an array of characters, where each character is an object with defined behavior.
- Following this pattern, a call to ToUpper() iterates over the characters of the string and internally delegates the call to each character. When calling ToUpper on a character, the String class also passes a reference to the Locale, which contains the details of the character map and encoding.
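The per-character delegation described in the list above can be sketched in C# as follows. This is only an illustration of the idea, not the actual .NET or Java implementation; `PerCharUpper` and `ToUpperPerChar` are hypothetical names. It relies on `TextInfo.ToUpper(char)`, which maps one UTF-16 code unit at a time, so characters outside the Basic Multilingual Plane (stored as surrogate pairs) are left unchanged here.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class PerCharUpper
{
    // Uppercases a string by delegating each char to the culture's TextInfo.
    public static string ToUpperPerChar(string input, CultureInfo culture)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));
        if (culture == null) throw new ArgumentNullException(nameof(culture));

        var result = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // The culture supplies the character map, as the answer describes.
            result.Append(culture.TextInfo.ToUpper(c));
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToUpperPerChar("abc1", CultureInfo.InvariantCulture));
    }
}
```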
If you are interested in an actual implementation, you can refer to the open source implementation of the java.lang.String class from the Java language, which is the equivalent of the C# string class.
The link below points to the source code of java.lang.String. There are two overloaded methods: toUpperCase() and toUpperCase(Locale). Internally, toUpperCase() calls toUpperCase(Locale) with the default locale, so the second method will be of interest to you.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java
Hope this information helps.
