1

I have to compare two chars: '2' < 'a'. I typed this into my console and it returned True. Why is that?

amalloy
schmanh
  • 1
    I'm curious: what did you expect to happen instead? Why? – Daniel Wagner Oct 23 '22 at 20:11
  • The characters `'2'` and `'a'` are not equal, so one of them has to be less than the other (or we could simply remove the `Ord` instance for `Char`, but that would be very inconvenient). But really it's arbitrary. Representing characters by numeric codes fundamentally imposes an easy order on all characters that you can represent, but honestly you probably shouldn't worry about what that specific order is except within a few specific classes (i.e. letters obviously come in alphabetic order, digits in numeric order, but anything else is much more "just because" than anything meaningful). – Ben Oct 24 '22 at 02:49
  • 1
    The answer to this question will also be almost exactly the same in almost any programming language. Especially if you're not going beyond the ASCII character set. This isn't a weird quirk of Haskell or anything, it's a general consensus for how characters work in programming. – Ben Oct 24 '22 at 02:51

2 Answers

8

Because the character 2 comes before the character a in the ASCII table.

Indeed, 2's decimal code is 50, whereas a's decimal code is 97, so the latter occurs 97 - 50 = 47 characters after the former, as demonstrated by this:

iterate succ '2' !! 47

which gives back 'a'. iterate has the signature (a -> a) -> a -> [a]: iterate f x is the infinite list [x, f x, f (f x), f (f (f x)), …], and !! 47 picks the element at index 47 (counting from zero), that is, the result of applying succ 47 times.
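As a minimal sketch of the arithmetic above, the following program checks the two codes with ord and repeats the iterate step. The names codeOfTwo, codeOfA and stepsAfter are made up for this illustration and are not library functions.

```haskell
import Data.Char (chr, ord)

-- Numeric codes of the two characters being compared.
codeOfTwo, codeOfA :: Int
codeOfTwo = ord '2'   -- 50
codeOfA   = ord 'a'   -- 97

-- Hypothetical helper: apply succ n times by indexing into
-- the infinite list that iterate builds.
stepsAfter :: Int -> Char -> Char
stepsAfter n c = iterate succ c !! n

main :: IO ()
main = do
  print (codeOfTwo, codeOfA)              -- (50,97)
  print (stepsAfter (codeOfA - codeOfTwo) '2')  -- 'a'
  print (chr (ord '2' + 47))              -- 'a', the same step done arithmetically
  print ('2' < 'a')                       -- True
```

The last line is the original comparison; it agrees with comparing the two numbers 50 and 97.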

Enlico
  • 5
    `Char` represents a Unicode character, so it would be more accurate to say that the code point for `'2'` is less than the code point for `'a'`. (ASCII is just one *encoding* of a subset of the possible `Char` values.) – chepner Oct 23 '22 at 13:57
1

Both values, '2' and 'a', are of type Char.

-- GHCi

:t '2'    -- Char
:t 'a'    -- Char

In Haskell, a Char holds a Unicode code point, a number that fits in 32 bits. So the question is: can we use the operator < to compare these numbers? Yes!

:i Char   -- instance Ord Char -- Defined in `GHC.Classes'

So Haskell says that Char values can be ordered, because Char is an instance of the Ord class, and we can use operators like < to compare them.
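A short sketch of what the Ord instance gives you in practice, using only Prelude functions and sort from Data.List. The names sample and sortedSample are chosen for this example only.

```haskell
import Data.List (sort)

-- An arbitrary mix of digits and letters, picked for illustration.
sample :: String
sample = "a2Zb1"

-- sort works on any Ord type, so it can order Char values directly.
sortedSample :: String
sortedSample = sort sample

main :: IO ()
main = do
  print ('2' < 'a')         -- True
  print (compare '2' 'a')   -- LT
  print (max '2' 'a')       -- 'a'
  print sortedSample        -- "12Zab"
```

Digits sort before uppercase letters, which sort before lowercase letters, matching their numeric codes.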

If I go on Hoogle, the Haskell version of Google, I can see the definition of the Ord class, by simply typing Ord into the search box.

class Eq a => Ord a where
    compare              :: a -> a -> Ordering
    (<), (<=), (>), (>=) :: a -> a -> Bool
    max, min             :: a -> a -> a

    compare x y = if x == y then EQ
                  -- NB: must be '<=' not '<' to validate the
                  -- above claim about the minimal things that
                  -- can be defined for an instance of Ord:
                  else if x <= y then LT
                  else GT

    x <  y = case compare x y of { LT -> True;  _ -> False }
    x <= y = case compare x y of { GT -> False; _ -> True }
    x >  y = case compare x y of { GT -> True;  _ -> False }
    x >= y = case compare x y of { LT -> False; _ -> True }
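To see those default methods at work, here is a minimal sketch with a made-up type. Size and rank are hypothetical names invented for this example; only compare is written by hand, and the comparison operators come from the class defaults.

```haskell
-- A hypothetical type, used only to illustrate the default methods.
data Size = Small | Medium | Large
  deriving (Eq, Show)

-- Hypothetical helper mapping each constructor to a number.
rank :: Size -> Int
rank Small  = 0
rank Medium = 1
rank Large  = 2

-- Only compare is defined here; (<), (<=), (>), (>=), max and min
-- fall back to the default definitions in the Ord class.
instance Ord Size where
  compare x y = compare (rank x) (rank y)

main :: IO ()
main = do
  print (Small < Large)      -- True
  print (Medium >= Large)    -- False
  print (max Small Medium)   -- Medium
```

The instance for Char follows the same idea, with the character's numeric code playing the role of rank.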

Char is not tied to an encoding such as UTF-32; it holds a Unicode code point. The ASCII characters occupy the lowest code points, 00 through 7F hex. 'a' is 97 decimal (61 hex) and '2' is 50 decimal (32 hex), so when the two code points are compared, '2' is indeed less than 'a', and the function (<) returns True.
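The hex values can be checked with a small sketch like the one below. codePoint is a hypothetical helper written for this example; ord, toUpper and showHex are the standard functions from Data.Char and Numeric.

```haskell
import Data.Char (ord, toUpper)
import Numeric (showHex)

-- Hypothetical helper: render a character's code point in U+XXXX notation.
codePoint :: Char -> String
codePoint c = "U+" ++ pad (map toUpper (showHex (ord c) ""))
  where
    -- Left-pad with zeros to at least four hex digits.
    pad s = replicate (4 - length s) '0' ++ s

main :: IO ()
main = do
  putStrLn (codePoint '2')       -- U+0032
  putStrLn (codePoint 'a')       -- U+0061
  print (ord '2' < ord 'a')      -- True
  print ('2' < 'a')              -- True, the same answer
```

Comparing the characters and comparing their code points give the same result.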

Francis King
  • 4
    It's the Unicode *code points*, independent of any particular encoding (ASCII, UTF-32, etc) that are used by the `Ord` instance for `Char`. `'2'` is U+0032 and `'a'` is U+0061, and 0x32 is less than 0x61. – chepner Oct 23 '22 at 14:00