I'm implementing a case-folding function in JavaScript. I've generated the JavaScript code I need from the CaseFolding.txt file.

The documentation for U_FOLD_CASE_DEFAULT says that ICU will use:

default mappings defined in CaseFolding.txt.

I would like to confirm that using the mappings with status C and F is the default option. I know T is excluded per the documentation:

A. To do a simple case folding, use the mappings with status C + S.

B. To do a full case folding, use the mappings with status C + F.

The mappings with status T can be used or omitted depending on the desired case-folding behavior. (The default option is to exclude them.)

Am I selecting the correct values out of CaseFolding.txt?

一二三
Daniel A. White

2 Answers


It depends on whether there is access to a buffer: U_FOLD_CASE_DEFAULT refers to the C + S mappings when using the character functions in uchar.h (like u_foldCase()), and the C + F mappings when using the string functions in ustring.h (like u_strFoldCase()).

In general, you should prefer the C + F mappings whenever possible (i.e., whenever you don't need a 1:1 mapping between input and output characters).
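To illustrate why the full mappings are not 1:1, here is a minimal sketch using a tiny hand-copied subset of CaseFolding.txt. The `fold` helper and the table names are made up for this example and are not part of ICU; a real table would be generated from the whole file.

```javascript
// Tiny subset of CaseFolding.txt, split by status field.
const common = [["\u0041", "\u0061"]];            // 0041; C; 0061
const simple = [["\u1E9E", "\u00DF"]];            // 1E9E; S; 00DF
const full = [["\u00DF", "ss"], ["\u1E9E", "ss"]]; // 00DF; F; 0073 0073 and 1E9E; F; 0073 0073

const simpleTable = new Map([...common, ...simple]); // C + S: one code point in, one out
const fullTable = new Map([...common, ...full]);     // C + F: output may be longer

// Hypothetical helper: fold a string one code point at a time.
function fold(text, table) {
  let out = "";
  for (const ch of text) {
    out += table.has(ch) ? table.get(ch) : ch;
  }
  return out;
}

console.log(fold("A\u1E9E", simpleTable)); // "aß"
console.log(fold("A\u1E9E", fullTable));   // "ass"
```

With C + S the result always has the same number of code points as the input, which is what a per-character function needs; with C + F the result can grow, so it only makes sense for string-level folding.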

一二三

Yes, that is the default. According to Section 3.13 of the current Unicode Standard (8.0.0), default case folding:

... Uses the mappings with the status field value “C” or “F” in the data file CaseFolding.txt in the Unicode Character Database.
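As a rough sketch of what that selection could look like when generating a table from CaseFolding.txt, the following filters lines by their status field. The function name `buildFoldTable` and the inline sample are assumptions for illustration; the sample lines are copied from the data file, but a real generator would read the whole file.

```javascript
// Hypothetical helper: build a folding table from CaseFolding.txt text,
// keeping only the lines whose status is in `statuses`.
function buildFoldTable(text, statuses) {
  const table = new Map();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop trailing comments
    if (line === "") continue;                 // skip blank and comment-only lines
    // Each data line has the form: <code>; <status>; <mapping>;
    const [code, status, mapping] = line.split(";").map((field) => field.trim());
    if (!statuses.includes(status)) continue;
    const from = String.fromCodePoint(parseInt(code, 16));
    const to = mapping
      .split(/\s+/)
      .map((hex) => String.fromCodePoint(parseInt(hex, 16)))
      .join("");
    table.set(from, to);
  }
  return table;
}

const sample = `
0049; C; 0069; # LATIN CAPITAL LETTER I
0049; T; 0131; # LATIN CAPITAL LETTER I
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE
0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
`;

// Default full case folding: status C or F, with T and S left out.
const defaultFull = buildFoldTable(sample, ["C", "F"]);
console.log(defaultFull.get("I"));      // "i" (the C mapping, not the Turkic T one)
console.log(defaultFull.get("\u00DF")); // "ss"
```

Passing `["C", "S"]` instead would give the simple folding table from the same data.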

DPenner1