I have a string that needs to be converted so that the first character of each word is upper case. The ToTitleCase method works fine except when the string contains special characters.

Below is the code and expected result

string textToConvert = "TEST^S CHECK";
TextInfo myTI = new CultureInfo("en-US", false).TextInfo;
return myTI.ToTitleCase(textToConvert.ToLower());

Expected result: Test^s Check. But the result comes out as Test^S Check, with the "S" capitalized after the special character ^.

Is there any way to change the conversion to get the expected result?

Hadi Samadzad
user2081126
  • Not with `ToTitleCase`, though they do reserve the rights to make the function slower in the future ;) - for this, now, you'll have to roll your own (which should not be hard to do). – 500 - Internal Server Error Dec 09 '19 at 11:31
  • What would you expect this string to become? `"TEST.S CHECK"`? – Lasse V. Karlsen Dec 09 '19 at 11:32
  • @LasseV.Karlsen: It should be Test.s Check – user2081126 Dec 09 '19 at 11:34
  • The symbol `^` is called "Circumflex accent" and in Unicode it is declared as a "modifier symbol". In .NET there is a [small static function](https://referencesource.microsoft.com/#mscorlib/system/globalization/textinfo.cs,b51cc381eb6a95dc) that determines whether a character is a word delimiter, and modifier symbols are word delimiters. This means that the following character S is a separate word as far as .NET is concerned. – Dialecticus Dec 09 '19 at 11:39
  • Just be aware that ToTitleCase is actually working just fine, it's just that what you want is not what it supports. In other words, this handling is not a bug, it's supposed to be like this. – Lasse V. Karlsen Dec 09 '19 at 11:43
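
The point made in the comments can be checked directly. The following is a minimal sketch, not something from the original thread: `DelimiterCheck` and `CategoryOf` are hypothetical names introduced for illustration, while `char.GetUnicodeCategory` and `TextInfo.ToTitleCase` are the standard .NET calls.

```csharp
using System;
using System.Globalization;

public static class DelimiterCheck
{
    // Hypothetical helper (not part of .NET): returns the Unicode category of a character.
    public static UnicodeCategory CategoryOf(char c) => char.GetUnicodeCategory(c);

    public static void Main()
    {
        // '^' (U+005E) is a modifier symbol, not a letter.
        Console.WriteLine(CategoryOf('^')); // ModifierSymbol
        Console.WriteLine(CategoryOf('S')); // UppercaseLetter

        // Reproduces the call from the question; the question reports
        // that the "s" after '^' comes back capitalized.
        TextInfo ti = new CultureInfo("en-US", false).TextInfo;
        Console.WriteLine(ti.ToTitleCase("test^s check"));
    }
}
```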

2 Answers

ToTitleCase is a handy method, but if you need more fine-grained control, Regex might be the better option:

string titleCase = Regex.Replace(textToConvert.ToLower(), @"^[a-z]|(?<=\s)[a-z]",
    match => match.Value.ToUpper());

^[a-z]|(?<=\s)[a-z] will match a letter at the start of the string, and letters preceded by whitespace (space, tab or newline).
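
As a rough illustration, the same pattern can be wrapped in a small method and run against the inputs from the question and the comments. `WhitespaceTitleCase` and `Convert` are hypothetical names, and the sketch swaps in `ToLowerInvariant`/`ToUpperInvariant` (an assumption, not what the answer uses) so the result does not depend on the current culture.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceTitleCase
{
    // Hypothetical helper: upper-cases only letters at the start of the
    // string or directly after whitespace; everything else is lower-cased.
    public static string Convert(string text)
    {
        return Regex.Replace(
            text.ToLowerInvariant(),
            @"^[a-z]|(?<=\s)[a-z]",
            match => match.Value.ToUpperInvariant());
    }

    public static void Main()
    {
        Console.WriteLine(Convert("TEST^S CHECK")); // Test^s Check
        Console.WriteLine(Convert("TEST.S CHECK")); // Test.s Check
    }
}
```

Note that `[a-z]` only covers ASCII letters, so words starting with other letters (for example Cyrillic) stay lower case with this pattern.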

Johnathan Barclay

Well, ToTitleCase turns the first letter of each word to upper case and all the others to lower case. A word, in .NET terms, is a run of consecutive letters, and, alas, ^ is not a letter; that's why TEST^S consists of 2 words.

We can redefine a word as follows:

  • a word must start with a letter
  • a word can contain letters, apostrophes ', circumflexes ^, and full stops .

In this case we can use regular expressions:

  using System.Text.RegularExpressions;

  ... 

  string source = "TEST^S CHECK по-русски (in RUSSIAN) it's a check! a.b.c.d";

  string result = Regex.Replace(source, @"\p{L}[\p{L}\^'\.]*",
    match => match.Value.Substring(0, 1).ToUpper() + match.Value.Substring(1).ToLower());

  Console.Write(result);

Outcome:

  Test^s Check По-Русски (In Russian) It's A Check! A.b.c.d
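
For comparison, here is a sketch of the "roll your own" route mentioned in the comments, using a plain loop instead of a regular expression. It follows the same word definition as above; `WordCaser`, `ToTitleCase` and the `extraWordChars` parameter are hypothetical names introduced for this example, and the invariant-culture casing is an assumption.

```csharp
using System;
using System.Text;

public static class WordCaser
{
    // Hypothetical helper: extraWordChars lists the non-letter characters
    // that may continue a word once it has started with a letter.
    public static string ToTitleCase(string source, string extraWordChars = "^'.")
    {
        var sb = new StringBuilder(source.Length);
        bool inWord = false;

        foreach (char c in source)
        {
            if (char.IsLetter(c))
            {
                // First letter of a word goes upper case, the rest lower case.
                sb.Append(inWord ? char.ToLowerInvariant(c) : char.ToUpperInvariant(c));
                inWord = true;
            }
            else
            {
                // A non-letter keeps the word going only if it is allowed
                // and we are already inside a word.
                inWord = inWord && extraWordChars.IndexOf(c) >= 0;
                sb.Append(c);
            }
        }

        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToTitleCase("TEST^S CHECK")); // Test^s Check
    }
}
```

Passing a different `extraWordChars` string changes which characters are treated as part of a word, without touching any pattern syntax.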
Dmitry Bychenko