I am required to use Intl.Collator (English only) to sort strings that contain both letters and numbers, where the numbers can appear anywhere in the string. In the examples below, x > y means "x sorts before y". The strings must be sorted such that:

  1. An upper-case letter comes before the same letter in lower case; different letters sort alphabetically regardless of case (e.g. A > a, a > b, a > B, A > B, A > b).
  2. Numbers in the string are sorted numerically (e.g. 1 > 2, 1 > 10, 2 > 10).
  3. Numbers sort before letters (numbers > letters) when they occupy the same position.

I am using this:

return new Intl.Collator(locale, { caseFirst: 'upper', numeric: true, sensitivity: 'variant' })
  .compare(stringA, stringB);

You can see an explanation of the Collator options object here: https://reference.codeproject.com/javascript/reference/global_objects/collator

This works fine when the numbers are at the beginning of the string:

'9elliot' > '12morgan',
'54mary' > '54Ralph',
'23John' > '23john'
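
The three orderings above can be checked with a short script. This is only a sketch: the locale is fixed to 'en', and the names `collator`, `pairs` and `sortedPairs` are made up for the illustration.

```javascript
// Same options as in the question, with the locale fixed to 'en'.
const collator = new Intl.Collator('en', {
  caseFirst: 'upper',
  numeric: true,
  sensitivity: 'variant',
});

// Each pair is deliberately listed in the "wrong" order.
const pairs = [
  ['12morgan', '9elliot'],
  ['54Ralph', '54mary'],
  ['23john', '23John'],
];

// Sort a copy of every pair with the collator's bound compare function.
const sortedPairs = pairs.map((pair) => [...pair].sort(collator.compare));
console.log(sortedPairs);
// [ [ '9elliot', '12morgan' ], [ '54mary', '54Ralph' ], [ '23John', '23john' ] ]
```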

But I have run across a case where sorting fails and I cannot figure out why:

console.log(['f1oobar', 'F2oobar'].sort(
  new Intl.Collator('en', { caseFirst: 'upper', numeric: true, sensitivity: 'variant'}).compare
));

// Prints ["f1oobar", "F2oobar"]

I need it to sort as ["F2oobar", "f1oobar"] because Capital "F" should come before lower-case "f".

I have tried all variants of caseFirst, numeric and sensitivity with no change in the result. I've even removed numeric and sensitivity with no change.
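
A likely explanation, offered as a sketch rather than a definitive answer: Unicode collation compares the whole string on letters and digits first, and looks at case only when the two strings are otherwise equal. Since '1' and '2' already differ, the case of the leading "f" never gets a say, whatever caseFirst is set to. The snippet below compares the pair from the question with a pair that differs only in case; `results`, `digitsDiffer` and `onlyCaseDiffers` are illustrative names.

```javascript
const results = ['upper', 'lower', 'false'].map((caseFirst) => {
  const c = new Intl.Collator('en', { caseFirst, numeric: true, sensitivity: 'variant' });
  return {
    caseFirst,
    // The digits differ, so case is never consulted.
    digitsDiffer: Math.sign(c.compare('f1oobar', 'F2oobar')),
    // Only the case differs, so caseFirst decides the order.
    onlyCaseDiffers: Math.sign(c.compare('f1oobar', 'F1oobar')),
  };
});

console.log(results);
// digitsDiffer is -1 for every caseFirst value ('f1oobar' always sorts first);
// onlyCaseDiffers is 1 for 'upper' and -1 for 'lower'.
```

If that reading is right, no combination of Intl.Collator options alone will give position-by-position case precedence.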

Who can explain and/or solve this?

Russ
  • And what if "bird" comes before "Boston"? – n-- Aug 19 '22 at 21:26
  • It should not. B > b – Russ Aug 20 '22 at 03:46
  • Well, it is so, and I bet you can't do anything to change that. Those configuration parameters, like `caseFirst`, are mostly hints for the collator and not a hard enforcement. If you really have no choice but to use the collator, you may want to start with the [common concepts](https://unicode-org.github.io/icu/userguide/collation), then continue with a ["deeper" reading.](http://www.unicode.org/reports/tr35/tr35-collation.html#Case_Parameters) – n-- Aug 20 '22 at 09:30
  • That seems like some kind of a bug or unexpected behaviour – Konrad Aug 20 '22 at 10:07
  • Well, that is discouraging. Their high-level documentation needs a slight bit more detail. Their deeper one is disappointing. I may have to come up with my own algorithm. The scary part is when we need to start supporting other languages... I'd be happy if the numeric sorting parts worked correctly (anywhere in the string), even if it's case insensitive. – Russ Aug 22 '22 at 14:00
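
The last comment mentions writing a custom algorithm. A minimal sketch of one possible approach follows, still using Intl.Collator for the individual comparisons. It assumes English text, treats only ASCII digit runs as numbers, and compares one character at a time, so it will not handle accents stored as separate combining marks. `tokenize` and `compareCaseFirst` are hypothetical helper names, not part of Intl.

```javascript
// Sketch only: tokenize and compareCaseFirst are hypothetical helpers.

// Treats letters that differ only by case as equal.
const baseCollator = new Intl.Collator('en', { sensitivity: 'base' });
// Breaks a same-letter tie so that upper case comes first.
const caseCollator = new Intl.Collator('en', { caseFirst: 'upper', sensitivity: 'variant' });

// Split into whole digit runs and single non-digit characters.
function tokenize(str) {
  return str.match(/\d+|\D/gu) || [];
}

function compareCaseFirst(a, b) {
  const ta = tokenize(a);
  const tb = tokenize(b);
  const shared = Math.min(ta.length, tb.length);
  for (let i = 0; i < shared; i++) {
    const x = ta[i];
    const y = tb[i];
    const xIsNum = /^\d/.test(x);
    const yIsNum = /^\d/.test(y);
    // Rule 3: a number sorts before a letter at the same position.
    if (xIsNum !== yIsNum) return xIsNum ? -1 : 1;
    let diff;
    if (xIsNum) {
      // Rule 2: compare digit runs numerically (BigInt avoids precision loss).
      const nx = BigInt(x);
      const ny = BigInt(y);
      diff = nx < ny ? -1 : nx > ny ? 1 : 0;
    } else {
      // Rule 1: alphabetical order first, then upper case before lower case.
      diff = baseCollator.compare(x, y) || caseCollator.compare(x, y);
    }
    if (diff !== 0) return diff;
  }
  // All shared tokens are equal: the shorter string sorts first.
  return ta.length - tb.length;
}

console.log(['f1oobar', 'F2oobar'].sort(compareCaseFirst));
// [ 'F2oobar', 'f1oobar' ]
console.log(['bird', 'Boston', 'a10', 'a2'].sort(compareCaseFirst));
// [ 'a2', 'a10', 'Boston', 'bird' ]
```

The trade-off is that the position-by-position walk gives up the collator's whole-string handling, so supporting other languages would likely need more than swapping the locale.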
