To understand this, we have to look at the ASCII representations of letters. It's easiest to do this in base 2.
A 01000001 a 01100001
B 01000010 b 01100010
C 01000011 c 01100011
D 01000100 d 01100100
... ...
X 01011000 x 01111000
Y 01011001 y 01111001
Z 01011010 z 01111010
Notice that the upper-case letters all begin with 010, and the lower-case letters all begin with 011. Notice that the lower-order bits are all the same for the upper- and lower-case versions of the same letter.
So: all we need to do to convert a lower-case letter to the corresponding upper-case letter is to change the 011 to 010, or in other words, turn off the 00100000 bit.
Now, the standard way to turn off a bit is to do a bitwise AND with a mask that has a 0 in the position of the bit you want to turn off, and 1's everywhere else. So the mask we want is 11011111. We could write that as 0xdf, but the programmer in this example has chosen to emphasize that it's the complement of 00100000 by writing ~32. 32 in binary is 00100000.
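As a minimal sketch of the masking step described above (the helper name clear_case_bit is hypothetical, not taken from the original code), the operation might look like this in C, assuming an ASCII execution character set:

```c
/* Hypothetical helper: turn off the 00100000 (decimal 32) bit of c.
 * Assumes ASCII, where 'a'..'z' differ from 'A'..'Z' only in that bit. */
static char clear_case_bit(char c)
{
    /* ~32 has a 0 in the 32 position and 1's everywhere else,
     * so its low eight bits are 11011111 (0xdf). */
    return (char)(c & ~32);
}
```

Under that assumption, clear_case_bit('a') gives 'A' and clear_case_bit('z') gives 'Z'. Writing the mask as 0xdf instead of ~32 would give the same result for 7-bit ASCII values.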
This technique works fine, except that it will do strange things with non-letters. For example, it will turn '{' into '[' (because they have the ASCII codes 01111011 and 01011011, respectively). It will turn an asterisk '*' into a newline '\n' (00101010 into 00001010).
The other way of converting lower to upper case in ASCII is to subtract 32. That, also, will convert 'a' to 'A' (97 to 65, in decimal), but it would also convert, for example, 'A' to '!'. The bitwise AND technique is actually advantageous in this case because it converts 'A' to 'A' (which is what a convert-to-uppercase routine ought to do).
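To make the comparison concrete, here is a small sketch of both approaches side by side. The function names upper_by_mask and upper_by_subtract are made up for illustration, and ASCII is assumed throughout:

```c
/* Hypothetical helper: clear the 32 bit. Leaves 'A'..'Z' unchanged. */
static char upper_by_mask(char c)
{
    return (char)(c & ~32);
}

/* Hypothetical helper: subtract 32 unconditionally.
 * Correct for 'a'..'z', but shifts every other character too. */
static char upper_by_subtract(char c)
{
    return (char)(c - 32);
}
```

Both map 'a' (97) to 'A' (65). Given 'A', the mask version returns 'A', while the subtract version returns '!' (65 - 32 = 33). Neither is safe on non-letters: the mask version still turns '{' into '['.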
The bottom line is that whether you AND with ~32 or subtract 32, in a properly safe function you're going to have to also check that the character being converted is the right kind of letter to begin with.
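A guarded version might look something like the following. This is only a sketch: the name ascii_toupper is hypothetical, and the range check assumes ASCII, where 'a' through 'z' are contiguous.

```c
/* Hypothetical helper: upper-case c only if it is an ASCII lower-case
 * letter; every other character is returned unchanged. */
static char ascii_toupper(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c & ~32);   /* safe: c is known to be a lower-case letter */
    return c;
}
```

With the check in place, '{' and '*' pass through untouched. In real code the standard toupper function from <ctype.h> is usually the better choice, since it does not hard-code the character set.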
Also, it's worth noting that this technique absolutely assumes the 7-bit ASCII character set, and will not work with accented or non-Roman letters of other character sets, such as ISO-8859 or Unicode. (EBCDIC would be another matter.)