55

For some inexplicable reason, the byte primitive type is signed in Java. This means that valid values are -128..127 instead of the usual 0..255 range representing 8 significant bits in a byte (without a sign bit).

This means that byte manipulation code usually does integer calculations and ends up masking out the last 8 bits.
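A minimal sketch of the masking idiom referred to above. The class and method names are made up for illustration; `Byte.toUnsignedInt` is a JDK helper available since Java 8.

```java
public class UnsignedByteDemo {
    // Interpret a signed Java byte as an unsigned value in 0..255.
    static int toUnsigned(byte b) {
        // b is promoted to int with sign extension; the mask keeps only the low 8 bits.
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xF0;                      // bit pattern 1111 0000
        System.out.println(raw);                     // -16 (signed view)
        System.out.println(toUnsigned(raw));         // 240 (unsigned view)
        System.out.println(Byte.toUnsignedInt(raw)); // 240 (same result via the JDK helper)
    }
}
```

Forgetting the mask is the typical bug: the sign-extended int carries 24 leading one bits for any byte value above 0x7F.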

I was wondering if there is any real life scenario where the Java byte primitive type fits perfectly or if it is simply a completely useless design decision?


EDIT: The sole actual use case was a single-byte placeholder for native code. In other words, not to be manipulated as a byte inside Java code.


EDIT: I have now seen a place where a tight inner loop needed to divide by 7 (for numbers 0..32), so a lookup table with byte as the datatype was used to keep memory usage low, with L1 cache usage in mind. This has nothing to do with signedness, but it is a case of actual usage.
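A sketch of what such a lookup table might look like. The names are hypothetical and this is not the code from the case mentioned above, only an illustration of the idea.

```java
public class DivBySevenTable {
    // Precomputed quotients n / 7 for n in 0..32. The largest entry is 4,
    // so every value fits in a byte: 33 bytes of payload instead of 132 for an int[].
    static final byte[] DIV7 = new byte[33];

    static {
        for (int n = 0; n < DIV7.length; n++) {
            DIV7[n] = (byte) (n / 7);
        }
    }

    // Valid only for 0 <= n <= 32; anything else throws ArrayIndexOutOfBoundsException.
    static int divideBySeven(int n) {
        return DIV7[n];
    }

    public static void main(String[] args) {
        System.out.println(divideBySeven(32)); // 4
    }
}
```

Because all stored values are non-negative and small, the signedness of byte never comes into play here.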

Thorbjørn Ravn Andersen
  • 8
    I frequently write code that manipulates byte arrays. It's a common enough thing to do. – jahhaj Jul 31 '11 at 21:33
  • 1
    @Chris Lively - I disagree. Java is mostly good design decisions, with a few bad apples – Bozho Jul 31 '11 at 21:35
  • 8
    It's not just the `byte` that is signed. Except for `char`, I believe all of the primitive integer data types (`int`, `short`, `long`, and `byte`) are signed. There is no unsigned modifier that can be applied to any of them. In the sense of consistency, it makes sense. I think the real question is if a lack of unsigned integer data types makes sense. – Thomas Owens Jul 31 '11 at 21:38
  • unsigned types in C/C++ was the bigger mistake – jahhaj Jul 31 '11 at 21:43
  • 1
    @Chris, the "Write Once Run Everywhere"-thing is enough for us to live with most of the bad stuff. They got _that_ right. – Thorbjørn Ravn Andersen Jul 31 '11 at 21:45
  • 3
    @Thorbjørn Ravn Andersen: hmm... my Java friends actually say that a little differently: "Write Once, Debug Everywhere" ;) All in good fun. Either way, +1 for a good question. – NotMe Jul 31 '11 at 22:45
  • @Chris, not for the things we do. – Thorbjørn Ravn Andersen Jul 31 '11 at 23:28
  • A downvote? For this question? I'd love to know why?!? – Thorbjørn Ravn Andersen Jul 31 '11 at 23:29
  • Why do you say "end up masking out the last 8 bits." ? If integer calculation is made, 32 bit calculation is made, and a byte is 8 bits right? Then 24 bits should be maxed out, no? – Koray Tugay Jan 18 '15 at 10:06

9 Answers

33

Josh Bloch recently mentioned in a presentation that this is one of the mistakes in the language.

I think the reason behind this is that Java does not have unsigned numeric types, and byte should conform to that rule. (Note: char is unsigned, but does not represent numbers.)

As for the particular question: I can't think of any example. And even if there were examples, they would be fewer than the ones for 0..255, and they could be implemented using masking (rather than the majority)

Bozho
  • I know they are fewer. I was curious if they even existed. – Thorbjørn Ravn Andersen Jul 31 '11 at 22:07
  • He mentioned it much earlier in "Java Puzzlers" (2005) with Neal Gafter. From puzzle 24: "The lesson for language designers is that sign extension of byte values is a common source of bugs and confusion." – toto2 Jul 31 '11 at 22:34
  • 6
    Gosling: ["Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand ... unsigned arithmetic"](http://stackoverflow.com/questions/430346/why-doesnt-java-support-unsigned-ints), so they decided to leave it out. I think their primary mistake was to believe that the use case for `byte` was essentially the same as for `int`. – j-g-faustus Aug 01 '11 at 02:41
  • 2
    @j-g-faustus: C's problems with unsigned types all stem from the fact that operations between a signed type and unsigned type of the same size coerce both to unsigned. Unsigned types smaller than `int` would have posed no problem whatsoever, since there would be no signed type smaller than `int` they could interact with. As for larger types, IMHO there should be three: a 32-bit natural number, a 32-bit abstract algebraic ring, and a 64-bit abstract algebraic ring. Operations on numbers should either yield arithmetically correct results or throw an overflow exception, ... – supercat Jan 30 '14 at 23:09
  • 1
    ...and operations--*other than comparisons*--between numbers and rings should implicitly convert the numbers to rings, but rings should not convert back to numbers. Things like `hashCode` which make use of integer wrapping behavior should use the 32-bit ring type for calculations, and then convert back to `int` using a suitable method to convert the ring type to a signed integer. – supercat Jan 30 '14 at 23:15
16

byte, short, char types are mostly useless, except when used in arrays to save space.

Neither Java nor the JVM has any real support for them. Almost all operations on them will promote them to int or long first. We cannot even write something like

short a=1, b=2;
a = a + b;  // illegal
a = a << 1; // illegal

Then why the heck even bother with defining operations on byte, short, char types at all?

All they do is sneak in widening conversions that will surprise the programmer.
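For comparison, a short sketch of the forms that do compile. The class and method names are invented for illustration. The explicit cast and the compound assignment operators both narrow the int result back to short, silently discarding overflow.

```java
public class ShortArithmeticDemo {
    static short addWithCast(short a, short b) {
        // a + b is computed as an int; the cast narrows the result back to short.
        return (short) (a + b);
    }

    static short addCompound(short a, short b) {
        a += b; // compound assignment includes an implicit narrowing cast
        return a;
    }

    static short shiftLeftOne(short a) {
        a <<= 1; // same implicit cast as above
        return a;
    }

    public static void main(String[] args) {
        System.out.println(addWithCast((short) 1, (short) 2));       // 3
        System.out.println(addCompound((short) 1, (short) 2));       // 3
        System.out.println(shiftLeftOne((short) 1));                 // 2
        System.out.println(addWithCast(Short.MAX_VALUE, (short) 1)); // -32768 (wraps around)
    }
}
```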

Abdullah Khan
irreputable
  • 5
    +1 but `char` can sometimes be useful as a character, not a number – Laurent Pireyn Aug 01 '11 at 14:32
  • If you need to perform calculations on `char` data, you are better off using `int` instead. Fortunately, casting between `char` and `int` is much more pleasant than, say, between `byte` and `int` – irreputable Aug 01 '11 at 15:31
10

Amazingly, I just used byte in Java for the first time last week, so I do have an (albeit unusual) use-case. I was writing a native Java function, which lets you implement a function in a library that can be called by Java. Java types need to be converted to types in the native language, in this case C.

The function needed to take an array of bytes, but (forgetting about the byte type entirely at the time) I had it take a char[]. The signature Java generates for the C function gives the type of that parameter as jcharArray, which can be converted to a bunch of jchars, which are typedef-ed in jni.h to unsigned short. Naturally, that is not the same size -- it's 2 bytes instead of 1. This caused all sorts of problems with the underlying code. Making the Java type byte[] resulted in a jbyteArray, and jbyte on Linux is typedef-ed to signed char, which is the right size.

Michael Mrozek
4

Digitized sound (or any other signal) with 8-bit signed samples seems like the only reasonable example to me. Of course, signed bytes are not required for handling such signals, and it can be argued whether the Java byte "fits perfectly".
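As a rough illustration of that case, here is a sketch that scales 8-bit signed samples by a gain factor. The names are hypothetical and the sample format (one signed byte per sample, range -128..127) is an assumption. Since the samples are already signed, no masking is needed, only clamping.

```java
public class SignedSampleDemo {
    // Scale 8-bit signed samples by a gain factor, clamping to the byte range.
    static byte[] applyGain(byte[] samples, double gain) {
        byte[] out = new byte[samples.length];
        for (int i = 0; i < samples.length; i++) {
            int scaled = (int) Math.round(samples[i] * gain);
            // Clamp instead of letting the narrowing cast wrap around.
            int clamped = Math.max(Byte.MIN_VALUE, Math.min(Byte.MAX_VALUE, scaled));
            out[i] = (byte) clamped;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] louder = applyGain(new byte[] {100, -100, 0}, 2.0);
        System.out.println(java.util.Arrays.toString(louder)); // [127, -128, 0]
    }
}
```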

Personally I think not having unsigned is a mistake. Not only because there's more use for unsigned bytes/ints but because I prefer a stronger type system. It would be nice to be able to specify that negative numbers are not valid and allow compiler checks and runtime exceptions for violations.

lokori
  • Yeah, I think variants like `ubyte`, `ushort` and `uint` could be useful in some scenarios, like bitmasking, or processing streams of bytes. – DragShot Jul 04 '17 at 15:34
3

byte is used extensively in applet development for Java Card. Because cards have limited resources, every bit of memory is precious. Card processors are also limited in how they handle integer values: support for the int type is optional and java.lang.String is not supported, so all integer operations and data storage are done with byte and short variables and arrays. Since integer literals are of type int, they must be cast explicitly to byte or short throughout the code. Communication with the card goes through APDU commands, which are handed to the applet as an array of bytes that must be decomposed to decode the command class, instruction and parameters. The following code shows how important the byte and short types are to Java Card development:

package somepackage.SomeApplet;

import javacard.framework.*;
import org.globalplatform.GPSystem;
import org.globalplatform.SecureChannel;

public class SomeApplet extends Applet {

    // Card status
    private final static byte ST_UNINITIALIZED     = (byte) 0x01;
    private final static byte ST_INITIALIZED       = (byte) 0x02;

    // Instructions & Classes
    private final static byte PROP_CLASS           = (byte) 0x80;     

    private final static byte INS_INIT_UPDATE      = (byte) 0x50;
    private final static byte INS_EXT_AUTH         = (byte) 0x82;

    private final static byte INS_PUT_DATA         = (byte) 0xDA;
    private final static byte INS_GET_RESPONSE     = (byte) 0xC0;
    private final static byte INS_GET_DATA         = (byte) 0xCA;


    private final static short SW_CARD_NOT_INITIALIZED       = (short) 0x9101;  
    private final static short SW_CARD_ALREADY_INITIALIZED   = (short) 0x9102;  

    private final static byte OFFSET_SENT = 0x00;
    private final static byte OFFSET_RECV = 0x01;
    private static short[] offset;

    private static byte[] fileBuffer;
    private static short fileSize = 0;

    public static void install(byte[] bArray, short bOffset, byte bLength) {
        new SomeApplet( bArray, bOffset, bLength);
    }

    public SomeApplet(byte[] bArray, short bOffset, byte bLength) {
        offset = JCSystem.makeTransientShortArray((short) 2, JCSystem.CLEAR_ON_RESET);
        fileBuffer = new byte[FILE_SIZE];

        byte aidLen = bArray[bOffset];
        if (aidLen== (byte)0){
            register();
        } else {
            register(bArray, (short)(bOffset+1), aidLen);
        }
    }

    public void process(APDU apdu) {
        if (selectingApplet()) {
            return;
        }
        byte[] buffer = apdu.getBuffer();
        short len = apdu.setIncomingAndReceive(); 

        byte cla = buffer[ISO7816.OFFSET_CLA];
        byte ins = buffer[ISO7816.OFFSET_INS];
        short lc = (short) (buffer[ISO7816.OFFSET_LC] & 0x00ff); 

        while (len < lc) {
            len += apdu.receiveBytes(len);
        }

        SecureChannel sc = GPSystem.getSecureChannel();
        if ((short)(cla & (short)0x80) == ISO7816.CLA_ISO7816) {
            switch (ins) {
                case INS_PUT_DATA:
                    putData(buffer, ISO7816.OFFSET_CDATA, offset[OFFSET_RECV], len);

                    if ((cla & 0x10) != 0x00) {
                        offset[OFFSET_RECV] += len;
                    } else {
                        fileSize = (short) (offset[OFFSET_RECV] + len);
                        offset[OFFSET_RECV] = 0;
                    }
                    return;

                case INS_GET_DATA:
                case INS_GET_RESPONSE:
                    sendData(apdu);
                    return;
                default:
                    ISOException.throwIt(ISO7816.SW_INS_NOT_SUPPORTED);
            }

        }
        else if ((byte) (cla & PROP_CLASS) == PROP_CLASS) {
            switch (ins) {
                case INS_INIT_UPDATE:
                case INS_EXT_AUTH:
                    apdu.setOutgoingAndSend(ISO7816.OFFSET_CDATA, sc.processSecurity(apdu));
                    return;
                default:
                    ISOException.throwIt(ISO7816.SW_INS_NOT_SUPPORTED);
            }
        } else
            ISOException.throwIt(ISO7816.SW_CLA_NOT_SUPPORTED);
    }

    // Some code omitted

}
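The header decoding in process() can also be sketched in plain Java, without the Java Card libraries. This is an illustration under assumptions, not part of the applet: the class and method names are invented, and the offsets are the ISO 7816-4 header positions (the same values as the Java Card ISO7816.OFFSET_* constants).

```java
public class ApduHeaderDemo {
    // Header offsets as defined by ISO 7816-4.
    static final int OFFSET_CLA = 0;
    static final int OFFSET_INS = 1;
    static final int OFFSET_LC = 4;

    static final byte INS_PUT_DATA = (byte) 0xDA; // stored as -38

    // Lc is a length in 0..255, so it must be read as an unsigned value.
    static int commandDataLength(byte[] apdu) {
        return apdu[OFFSET_LC] & 0xFF; // without the mask, 0xC8 would read as -56
    }

    // Comparing byte to byte needs no mask: both sides are sign-extended the same way.
    static boolean isPutData(byte[] apdu) {
        return apdu[OFFSET_INS] == INS_PUT_DATA;
    }

    public static void main(String[] args) {
        byte[] apdu = {(byte) 0x00, (byte) 0xDA, 0x00, 0x00, (byte) 0xC8};
        System.out.println(isPutData(apdu));         // true
        System.out.println(commandDataLength(apdu)); // 200
    }
}
```

This is the same pattern as the `& 0x00ff` on the Lc byte in the applet: instruction codes are compared as bytes, while lengths are masked before use.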
Bahribayli
1

I think it is signed in order to be consistent with short and int.

As to whether it is used much, it makes the notion of "byte arrays" a construct rather than a primitive.

That's really all I have. :)

Ray Toal
  • @EJP is that fair? that would mean +8 net rep – Louis Rhys Aug 01 '11 at 04:46
  • 3
    @Louis Rhys if the downvoter had bothered to explain the downvote we could discuss it. As it is all I can see is that the answer is perfectly correct, and certainly sensible, and didn't merit the downvote in the first place. – user207421 Aug 02 '11 at 09:50
1

On a machine with words larger than 8 bits it's somewhat useful if you wish to store a lot of values that fit in an 8-bit range into a single array, but typically it's not a good idea to use them otherwise since a byte is actually more effort to get out of memory than an int.

Remember though that Java was designed for very small consumer devices (set-top TV boxes). I expect if it had been used this way on small 8-bit microprocessors it would have been more useful as it would fit the word size exactly and could be used for general "Math" operations on a very small scale.

The only reason I can see to make it signed is that an unsigned byte interacting with an int can be a little confusing--but I'm not convinced it's any more confusing than a signed one is!

Bill K
-1

The size of a byte is 8 bits. That size helps when processing input and output, for example when writing to a file or reading from a file. Consider a scenario in which you want to read input from the keyboard or from a file. If you use the byte data type, you know that you are receiving 8 bits at a time, which is one character in a single-byte encoding such as ASCII. So every time you read from an input stream, you know exactly how much data you are receiving.

4castle
-2

I used it frequently when I was programming software and games for J2ME. On most J2ME-devices, you have limited resources, so storing for example the map of a level in a byte-array is less resource-intensive than storing it in an int-array.
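A sketch of the kind of level map described above, with invented names and dimensions. Tile ids are stored one per byte and masked on the way out so that ids up to 255 are usable.

```java
public class TileMapDemo {
    static final int WIDTH = 64;
    static final int HEIGHT = 64;

    // One byte per tile: 4096 bytes of payload, versus 16384 for an int[] of the same size.
    static final byte[] TILES = new byte[WIDTH * HEIGHT];

    static void setTile(int x, int y, int tileId) {
        TILES[y * WIDTH + x] = (byte) tileId; // ids 128..255 are stored as negative bytes
    }

    static int tileAt(int x, int y) {
        return TILES[y * WIDTH + x] & 0xFF; // mask to recover the id as 0..255
    }

    public static void main(String[] args) {
        setTile(3, 5, 200);
        System.out.println(tileAt(3, 5)); // 200
    }
}
```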

Uooo