4

The code below runs without any errors. However, for a variable of type List<Integer>, the return type of get() should be Integer. When I call x.get(0), a String is returned, whereas I would expect this to throw an exception.

public static void main(String[] args)
      {
            ArrayList xa = new ArrayList();
            xa.addAll(Arrays.asList("ASDASD", "B"));
            List<Integer> x = xa;
            System.out.println(x.get(0));
      }

However, the code below, which only adds a getClass() call on the returned object, throws a ClassCastException. If the code above runs without errors, I would expect the following to run without an exception as well:

public static void main(String[] args)
      {
            ArrayList xa = new ArrayList();
            xa.addAll(Arrays.asList("ASDASD", "B"));
            List<Integer> x = xa;
            System.out.println(x.get(0).getClass());
      }

Why does Java perform a type conversion when fetching the class of the object?

Didier L
Aman J

2 Answers

6

The compiler has to insert type checking instructions at the byte code level where necessary, so while an assignment to Object, e.g. Object o = x.get(0); or System.out.println(x.get(0));, may not require it, invoking a method on the expression x.get(0) does require it.

The reason lies in the binary compatibility rules. Simply put, it is irrelevant whether the invoked method has been inherited or explicitly declared by the receiver type. The formal type of the expression x.get(0) is Integer and you are invoking the method getClass() on it, hence the invocation will be encoded as an invocation of a method named getClass with the signature () → java.lang.Class on the receiver class java.lang.Integer. The facts that this method was inherited from java.lang.Object and that it was declared final at compile time are not reflected in the compiled class.

So in theory, at runtime, the method could have been removed from java.lang.Object and a new method java.lang.Class getClass() added to java.lang.Integer without breaking compatibility with that specific code. While we know that this will never happen, the compiler is just following the formal rules and does not inject assumptions about the inheritance into the code.

Since the invocation will be compiled as an invocation targeting java.lang.Integer, a type cast is necessary before the invocation instruction, which will fail in the Heap Pollution scenario.
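The difference between the two cases can be reproduced in one small program. This is a minimal sketch, not the asker's exact code: the class name HeapPollutionDemo and the helper methods pollutedList, readAsObject, and getClassThrows are hypothetical names introduced for illustration. It assumes a javac that emits the checkcast described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeapPollutionDemo {

    // Builds a List<Integer> that actually holds Strings (heap pollution),
    // mirroring the raw ArrayList assignment from the question.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<Integer> pollutedList() {
        ArrayList raw = new ArrayList();
        raw.addAll(Arrays.asList("ASDASD", "B"));
        return raw; // unchecked conversion, no runtime check happens here
    }

    // Assigning to Object needs no cast, so the String passes through.
    static String readAsObject() {
        List<Integer> x = pollutedList();
        Object o = x.get(0);
        return String.valueOf(o);
    }

    // Invoking a method on the Integer-typed expression makes the compiler
    // insert a cast to Integer before the call, which is expected to fail.
    static boolean getClassThrows() {
        List<Integer> x = pollutedList();
        try {
            x.get(0).getClass();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readAsObject());
        System.out.println(getClassThrows());
    }
}
```

Both methods read the same element from the same polluted list; only the second one uses the expression as a method receiver.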

Note that if you change the code to

System.out.println(((Object)x.get(0)).getClass());

you make explicit the assumption that the method has been declared in java.lang.Object. The widening to java.lang.Object does not generate any additional bytecode instruction; all this code does is change the method invocation’s receiver type to java.lang.Object, eliminating the need for a type cast.
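As a self-contained sketch of this workaround (the class name ObjectReceiverDemo and the helpers pollutedList and classViaObject are hypothetical, introduced only for this example), the explicit widening lets the call reach the actual String element:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ObjectReceiverDemo {

    // Same heap-polluted list as in the question: declared List<Integer>,
    // but the elements are Strings.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<Integer> pollutedList() {
        ArrayList raw = new ArrayList();
        raw.addAll(Arrays.asList("ASDASD", "B"));
        return raw;
    }

    // The (Object) widening changes the receiver type of getClass() to
    // java.lang.Object, so no cast to Integer is needed before the call.
    static Class<?> classViaObject() {
        List<Integer> x = pollutedList();
        return ((Object) x.get(0)).getClass();
    }

    public static void main(String[] args) {
        System.out.println(classViaObject());
    }
}
```

The returned class reflects what is really stored in the list, which is why this form is useful for diagnosing heap pollution.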

There is an interesting deviation from the rules here: the compiler does encode the invocation as an invocation on java.lang.Object at the bytecode level if the method is one of the known final methods declared in java.lang.Object. This might be due to the fact that these specific methods are specified in the JLS, and encoding them in this form allows the JVM to identify these special methods quickly. But the combination of the checkcast instruction and the invokevirtual instruction still exhibits the same, compatible behavior.

Holger
  • isn't the `checkcast` added because of type-erasure, not because of binary compatibility? – Eugene Jun 21 '17 at 22:24
  • @Eugene: it’s the combination of type erasure, which is the reason why the reference type is `Object`, and binary compatibility, which is the reason why we need `Integer` even if we could invoke `Object#getClass()` instead. Since the question focuses on the difference between two cases, both subject to type erasure, the relevant difference between them is the binary compatibility issue, which applies only to the second case. – Holger Jun 22 '17 at 06:51
2

It's because of PrintStream#println:

public void println(Object x) {
    String s = String.valueOf(x);
    ...

See how it converts anything you give it to a String, but first accepts it as an Object (which compiles because Integer is an Object). Change your first code to:

    ArrayList xa = new ArrayList();
    xa.addAll(Arrays.asList("ASDASD", "B"));
    List<Integer> x = xa;
    Integer i = x.get(0);
    System.out.println(i);

and you will get the same failure.
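A runnable version of that experiment might look like the following. It is only a sketch: the class name AssignmentCastDemo and the helpers pollutedList and integerAssignmentThrows are hypothetical names, and the try/catch is added so the outcome can be observed as a boolean instead of a stack trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssignmentCastDemo {

    // Declared as List<Integer>, but filled with Strings through a raw type.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<Integer> pollutedList() {
        ArrayList raw = new ArrayList();
        raw.addAll(Arrays.asList("ASDASD", "B"));
        return raw;
    }

    // Assigning the element to an Integer variable requires a cast to
    // Integer, which fails because the element is a String.
    static boolean integerAssignmentThrows() {
        List<Integer> x = pollutedList();
        try {
            Integer i = x.get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(integerAssignmentThrows());
    }
}
```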

EDIT

Yes, Didier is right in his comment, hence this update after thinking about it for a while.

This can be simplified even further to understand why the compiler inserts the extra checkcast #5 // class java/lang/Integer:

 ArrayList<Integer> l = new ArrayList<>();
 l.get(0).getClass();

At runtime there is no Integer type argument, just plain Object; the snippet above would compile to, among other things:

  10: invokevirtual #4 // Method java/util/ArrayList.get:(I)Ljava/lang/Object;
  13: checkcast     #5 // class java/lang/Integer
  16: invokevirtual #6 // Method java/lang/Object.getClass:()Ljava/lang/Class;

Notice the checkcast that verifies that the element we get from that List is actually an Integer. List::get returns the type parameter E, which is erased to Object at runtime; to preserve the List<Integer> contract at runtime, the checkcast is needed.

Eugene
  • That does not really explain why the compiler generates a cast when calling `getClass()` – Didier L Jun 21 '17 at 10:12
  • @DidierL well the reference type is `List`; seems pretty logical to me – Eugene Jun 21 '17 at 10:29
  • Well, I would expect a cast when assigning to an `Integer` variable (as in your example), when calling a method that belongs to `Integer`, or when passing it to a method that takes an `Integer` as argument. Without Holger's explanation, I would not expect a cast when calling a method that is declared in `Object` like `getClass()`. – Didier L Jun 21 '17 at 10:38
  • The interesting point about your disassembly output is that the `checkcast` instruction would not be necessary for the execution. It got me thinking again, so I just verified that the application of the formal rules still holds for any other class: the invocation will be encoded with the exact receiver type, which makes the `checkcast` necessary. So the special treatment of `java.lang.Object`’s methods in the invocation instruction is not supposed to affect the other instructions, most notably, not intended to allow eliding the `checkcast`… – Holger Jun 22 '17 at 07:45
  • @Holger that is so intriguing! indeed you can basically throw away the `checkcast` here; this looks like a fail fast scenario to me ... – Eugene Jun 22 '17 at 09:19
  • @Holger also, this `Integer i = 12;i.getClass();` would compile to `Object.getClass` too; which again is not obvious at all. – Eugene Jun 22 '17 at 09:30
  • As said, that seems to be a property of all invocations of these `final` methods declared in `java.lang.Object`—and only those. E.g., if you declare `enum Foo { BAR }`, then `Foo.BAR.getDeclaringClass()` gets compiled as `Foo.getDeclaringClass()`, rather than `java.lang.Enum.getDeclaringClass()` whereas `Foo.BAR.getClass()` gets compiled as `java.lang.Object.getClass:()`. – Holger Jun 22 '17 at 10:35