Java was designed after C/C++ and revisited some much-debated topics:
Java text (String, Reader/Writer) contains Unicode, so text in mixed scripts can be combined freely. Internally a String was an array of UTF-16 char values; a .class file stores its string constants in (modified) UTF-8. byte[] (InputStream/OutputStream), in contrast, is only for binary data. Crossing the boundary between text and binary data always involves a conversion using the binary data's encoding (charset).
Numerical primitive types exist only in signed versions (except char, which one could consider non-numeric). The idea was to root out the signed/unsigned "problems" of C++. So byte, too, is signed, ranging from -128 to 127. However, overflow silently wraps around in Java as well, so one can do:
byte b = (byte) 255; // 0xFF or -1
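A small sketch to make both interpretations of that bit pattern visible:

byte b = (byte) 255;          // bit pattern 0xFF
System.out.println(b);        // -1: the signed interpretation
System.out.println(b & 0xFF); // 255: b widens to int, the mask keeps the low 8 bits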
The primitive types byte/short/int/long have fixed sizes in bytes, whereas C was notoriously platform-dependent here, and workarounds like C's uint32_t are a bit ugly (that 32 baked into the name).
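The constants on the wrapper classes (available since Java 8) make those fixed sizes explicit:

System.out.println(Byte.BYTES + "/" + Short.BYTES + "/" + Integer.BYTES + "/" + Long.BYTES); // 1/2/4/8 on every platform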
Having experienced tricky signed/unsigned bugs in C myself (more than 10 years ago), I think this decision was okay.
It is easier to calculate with a signed mindset and only at the end interpret values as unsigned than to juggle signed and unsigned parts throughout an expression.
Nowadays Java also supports calculations under an unsigned interpretation of values, like Integer.parseUnsignedInt.
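A small sketch of those helpers (all in java.lang since Java 8; the sample values are my own):

int u = Integer.parseUnsignedInt("4294967295");   // bit pattern 0xFFFFFFFF
System.out.println(u);                            // -1 under the signed interpretation
System.out.println(Integer.toUnsignedString(u));  // 4294967295
System.out.println(Integer.divideUnsigned(u, 2)); // 2147483647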