
Edit: Bug has been filed.

Let's say I have two ArrayLists that point to each other (circular reference):

x = createObject("java", "java.util.ArrayList").init();
y = createObject("java", "java.util.ArrayList").init();
x.add(y);
y.add(x);

If I call hashCode on any of them, it causes a StackOverflowError due to the ArrayList implementation. This is to be expected.

However, when I invoke System.identityHashCode, isn't it supposed to use the Object.hashCode implementation, which won't follow the elements in an ArrayList and thus won't cause a StackOverflowError?

Documentation of identityHashCode states:

Returns the same hash code for the given object as would be returned by the default method hashCode(), whether or not the given object's class overrides hashCode().

In Adobe ColdFusion, this code works fine:

System = createObject("java", "java.lang.System");

System.identityHashCode(x); // returns some integer
System.identityHashCode(y); // returns another integer

(This obviously also works when natively compiled and run with Java.)
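For reference, here is a minimal plain-Java sketch of the same experiment. The class and method names (`CircularHashDemo`, `makeCircularList`, `hashCodeOverflows`) are made up for illustration; only `ArrayList`, `hashCode`, and `System.identityHashCode` come from the question itself.

```java
import java.util.ArrayList;
import java.util.List;

public class CircularHashDemo {

    // Builds two ArrayLists that contain each other (like x and y above)
    // and returns the first one.
    static List<Object> makeCircularList() {
        List<Object> x = new ArrayList<>();
        List<Object> y = new ArrayList<>();
        x.add(y);
        y.add(x);
        return x;
    }

    // Returns true if calling hashCode() on the object ends in a StackOverflowError.
    static boolean hashCodeOverflows(Object o) {
        try {
            o.hashCode();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> x = makeCircularList();
        // ArrayList.hashCode() recurses into its elements, so this overflows.
        System.out.println("hashCode overflows: " + hashCodeOverflows(x));
        // identityHashCode does not look at the elements; the value itself is arbitrary.
        System.out.println("identityHashCode: " + System.identityHashCode(x));
    }
}
```

On a stock HotSpot JVM the first call should overflow while the second returns an integer that stays the same for the lifetime of the object.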

In Lucee however, it immediately causes a StackOverflowError:

lucee.runtime.exp.NativeException: java.lang.StackOverflowError
    at java.base/java.util.ArrayList.hashCodeRange(ArrayList.java:627)
    at java.base/java.util.ArrayList.hashCode(ArrayList.java:614)
    at java.base/java.util.ArrayList.hashCodeRange(ArrayList.java:627)
    at java.base/java.util.ArrayList.hashCode(ArrayList.java:614)
    [...]

Why is it running the ArrayList implementation of hashCode here?

Both CFML engines run with the same JVM (HotSpot) and Java version (11) on the same servlet (Tomcat 9). I'd like to understand why they behave differently.

Alex

1 Answer


The implementation of System.identityHashCode (iHC from here on) is native: it's implemented at the VM level, not in Java code. The specification of iHC is intentionally vague.

The reason it's vague is because it's highly platform dependent, and the spec is trying to give sufficient leeway to implementers of VMs on exotic platforms (and ColdFusion and Lucee certainly count, no?) to make an impl that fits spec.

It's technically possible for the hashCode impl of Object to scan fields, though that would be highly inefficient: System.iHC is used a ton in places where a fast response is required, and a field scan would be anything but fast. You're also not the only one who assumes that System.iHC will not loop forever, even on an object that (eventually) references itself.

But, the key point is, those are widely used assumptions; nothing in the spec of that method actually says it works that way.

On the flip side, the leeway that Lucee is taking (if what you say is indeed true) is rather excessive.

Thus, you're now well stuck. These things are all true:

  • Tons of code assumes iHC cannot loop. It is therefore, de facto, highly impractical that a VM impl would loop.
  • Tons of code assumes iHC is fast. It is therefore, de facto, highly impractical that a VM impl would be slow.
  • Nevertheless, Lucee can go the 'I am rubber and you are glue, whatever you report bounces off of me and sticks to you' route on this and just tell you that their impl is valid according to spec, and that therefore whatever code you care to toss at them to make your point is at fault.

But give them a chance first, before you assume they'll take the open door 'nu uh! Not our fault!' route that is technically available to them here.

If they deny your bug report, and/or you want to beef it up some before you file it at their bugtracker, here are some things you may wish to investigate:

  • Make a self-referencing object and use it as a key in an IdentityHashMap. Does this cause a StackOverflowError? That'd be a good one to lead with, because now you're showing them the severity of this issue: Either they admit that a core class in java.util is buggy, or they admit their code is buggy, or they take the even more exotic position that the specifications of IHM and System.iHC combine to conclude that any code that attempts to use self-ref objects as keys in an IHM is buggy. That's probably where they end up if they don't want to accept this bug, so prepare yourself for disappointment.
  • Find a couple of libraries and show that they just don't work on Lucee. One place to look is serialization libraries that state that they support self-ref / cloned refs (such as a list that contains itself, or contains the same obj ref multiple times).
  • How does == behave in Lucee? Does new String("foo") == new String("foo") evaluate to true?
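As a rough Java baseline for the first and last checks in that list, something like the sketch below could be used. The class and method names are hypothetical, and to actually exercise Lucee the same calls would have to be ported to CFML via createObject, since the interesting part is what the engine does around the Java calls.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class IdentityChecks {

    // Uses a list that contains itself as a key in an IdentityHashMap.
    // Returns true if put/get work, false if a StackOverflowError occurs.
    static boolean selfRefKeyWorks() {
        List<Object> self = new ArrayList<>();
        self.add(self);
        Map<Object, String> map = new IdentityHashMap<>();
        try {
            map.put(self, "value");
            return "value".equals(map.get(self));
        } catch (StackOverflowError e) {
            return false;
        }
    }

    // Compares two distinct String instances by reference.
    static boolean distinctStringsAreIdentical() {
        String a = new String("foo");
        String b = new String("foo");
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("self-ref key works: " + selfRefKeyWorks());
        System.out.println("new String == new String: " + distinctStringsAreIdentical());
    }
}
```

In plain Java the first check should succeed and the second should be false, because IdentityHashMap and == both work on references rather than on equals/hashCode. Any different result under Lucee would be worth putting in the bug report.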

But most of all, do keep some patience with the Lucee folk. It is entirely possible that there is no real way they can actually impl System.iHC due to the limitations of the platform they're working with, in which case there's not much they can do but commiserate and shrug.

rzwitserloot
    I checked Lucee's source code and found out that it doesn't even reach `identityHashCode`. The ArrayLists are stored in a `Set` which calls `contains` to do some "clean up"(?), triggering ArrayList's `hashCode`. So there we go. Filing a bug for them. Thanks for your insight. – Alex Mar 04 '21 at 22:45
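The effect described in that comment can be reproduced in plain Java. This is only a sketch under the assumption that the Set in question is hash-based (such as a HashSet); it is not Lucee's actual code, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class SetContainsDemo {

    // Two ArrayLists that contain each other; returns the first one.
    static List<Object> makeCircularList() {
        List<Object> x = new ArrayList<>();
        List<Object> y = new ArrayList<>();
        x.add(y);
        y.add(x);
        return x;
    }

    // Returns true if set.contains(candidate) ends in a StackOverflowError.
    static boolean containsOverflows(Set<Object> set, Object candidate) {
        try {
            set.contains(candidate);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    // A Set that compares members by reference instead of equals/hashCode.
    static Set<Object> identitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
    }

    public static void main(String[] args) {
        List<Object> x = makeCircularList();
        Set<Object> hashSet = new HashSet<>();
        hashSet.add("seed"); // non-empty, so the lookup has to hash the candidate
        System.out.println("HashSet.contains overflows: " + containsOverflows(hashSet, x));
        System.out.println("identity set overflows: " + containsOverflows(identitySet(), x));
    }
}
```

A hash-based Set has to call the candidate's own hashCode() to look it up, which is where the ArrayList recursion starts. An identity-based set avoids that call entirely, though whether that is a suitable fix on Lucee's side is for the Lucee developers to judge.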