18

I have a class for which equality (as per equals()) must be defined by the object identity, i.e. this == other.

I want to implement Comparable to order such objects (say by some getName() property). To be consistent with equals(), compareTo() must not return 0, even if two objects have the same name.

Is there a way to compare object identities in the sense of compareTo? I could compare System.identityHashCode(o), but that would still return 0 in case of hash collisions.
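
To make the problem concrete, here is a minimal sketch of what goes wrong when compareTo() looks only at the name while equals() stays identity-based. The class names `Item` and `InconsistentOrderingDemo` are made up for illustration; they are not from my real code.

```java
import java.util.Set;
import java.util.TreeSet;

class InconsistentOrderingDemo {
    // Counts how many of the given items survive insertion into a TreeSet,
    // which decides "sameness" via compareTo(), not equals().
    static int distinctCount(Item... items) {
        Set<Item> sorted = new TreeSet<>();
        for (Item item : items) {
            sorted.add(item);
        }
        return sorted.size();
    }

    public static void main(String[] args) {
        Item a = new Item("x");
        Item b = new Item("x");
        System.out.println(a.equals(b));         // identity-based equals
        System.out.println(a.compareTo(b));      // name-based comparison
        System.out.println(distinctCount(a, b)); // how many the TreeSet kept
    }
}

// Hypothetical class: equals()/hashCode() are inherited from Object (identity),
// while compareTo() orders by name only.
final class Item implements Comparable<Item> {
    private final String name;

    Item(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public int compareTo(Item other) {
        // Returns 0 for two distinct objects that share a name.
        return name.compareTo(other.name);
    }
}
```

A sorted set treats the two objects as the same element, even though equals() says they differ.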

Peter Mortensen
Balz Guenat
  • *compareTo() must not return 0, even if two objects have the same name.*, This will break the contract of compareTo, and thus such a `compareTo` would be unusable – Lino Apr 01 '19 at 08:18
  • 4
    Which clause of the contract would be broken exactly? – Balz Guenat Apr 01 '19 at 08:21
  • Did you mean that when `A` and `B` have equal `getName()`, you don't care whether `A > B` or `B > A`, but you want a consistent result, i.e. once `A > B` is true, it stays true for the whole process lifetime? – Ricky Mo Apr 01 '19 at 08:25
  • @RickyMo right. – Balz Guenat Apr 01 '19 at 08:31
  • Can you randomly pick a result and remember it in a global variable? That is, if doing so does not mess up your application too much. – Ricky Mo Apr 01 '19 at 08:41
  • This would require synchronization in a multithreaded application and would probably also prevent those objects from being garbage collected. – Balz Guenat Apr 01 '19 at 08:47
  • 3
    @Lino: [It is strongly recommended, but not strictly required that (x.compareTo(y)==0) == (x.equals(y))](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Comparable.html#compareTo(T)) – Haem Apr 01 '19 at 09:40
  • Maybe, a different thought: what is the *actual* problem you are trying to solve? I get the feeling we are chasing an XY problem here. – GhostCat Apr 03 '19 at 03:01
  • @GhostCat I got a list of things that I want to order. Among the things are duplicates that I need to keep. – Balz Guenat Apr 03 '19 at 06:40

7 Answers

31

I think the real answer here is: don't implement Comparable then. Implementing this interface implies that your objects have a natural order. Things that are "equal" should be in the same place when you follow up that thought.

If at all, you should use a custom comparator ... but even that doesn't make much sense. If the thing that defines a < b ... is not allowed to give you a == b (when a and b are "equal" according to your < relation), then the whole approach of comparing is broken for your use case.

In other words: just because you can put code into a class that "somehow" results in what you want ... doesn't make it a good idea to do so.
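
As a rough sketch of the comparator route (the class `Thing` and the helper `sortedByName` are hypothetical names, not from the question): the ordering lives outside the class, equals() stays identity-based, and because `Collections.sort` is documented as stable, objects with equal names are all kept, in their original relative order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortWithComparatorDemo {
    // Hypothetical class; equals() is inherited from Object, i.e. identity-based.
    static final class Thing {
        private final String name;

        Thing(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    // The ordering is defined outside the class, so Thing has no "natural order".
    static final Comparator<Thing> BY_NAME = Comparator.comparing(Thing::getName);

    // Returns a sorted copy; ties keep their original relative order (stable sort).
    static List<Thing> sortedByName(List<Thing> things) {
        List<Thing> copy = new ArrayList<>(things);
        Collections.sort(copy, BY_NAME);
        return copy;
    }

    public static void main(String[] args) {
        Thing b = new Thing("b");
        Thing a1 = new Thing("a");
        Thing a2 = new Thing("a");
        for (Thing t : sortedByName(List.of(b, a1, a2))) {
            System.out.println(t.getName());
        }
    }
}
```

Here it does not matter that the comparator returns 0 for two different objects: a list sort never drops elements.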

GhostCat
  • 10
    A comparator seems like a non-broken solution. Even if the expectation of `Comparable` is that `compare==0` only when the two objects are equal, a *comparator* does not have that expectation. – khelwood Apr 01 '19 at 08:23
  • 2
    Hm yes, I think what's really happening is that my objects really only have a partial ordering and I'm trying to force a total ordering using object identity. When I want to sort these objects, I can use a separate Comparator where it is not an issue if two different objects are "equal". – Balz Guenat Apr 01 '19 at 08:38
  • @BalzGuenat That sounds like a reasonable approach. Especially given the fact that this will avoid the headaches you get when thinking how to properly implement a reasonable hashcode for your class ;-) – GhostCat Apr 01 '19 at 08:46
  • Just a comment: "If the thing that defines a < b ... is not allowed to give you a == b, then the whole approach of comparing is broken for your use case." I disagree. This is not much different than other two-tier comparisons like e.g. comparing cars by make first, then model. The only difference is how the second property comes into existence. – Balz Guenat Apr 01 '19 at 10:24
  • @BalzGuenat Well, but when they are < on one property, then having !0 when that property gives you == ... that is where things became doubtful. – GhostCat Apr 01 '19 at 10:37
  • @GhostCat I don't think things become doubtful. The Comparator builder methods are even based on the concept of chained comparators. They compare one property and, if that comparison yields 0, compare another. To illustrate using my car example, a comparator would look like so: `Comparator.comparing(Car::getMake).thenComparing(Car::getModel)` – Balz Guenat Apr 01 '19 at 12:11
  • 1
    @BalzGuenat But then you define your order to be TWO orders. What you are asking for ... isn't that different. Like merging an "order" with ... well, something that is not? – GhostCat Apr 01 '19 at 12:15
  • Well, that was kind of the question from the beginning: Is object identity comparable and if yes, how? If object identity could be represented as a number or string, then clearly, it would be comparable (regardless of whether that is a good idea). But it seems (from this thread) that Java does not give access to such a representation, so it's not possible. – Balz Guenat Apr 01 '19 at 12:36
  • 1
    @Balz Your problem is inherently different from chained comparators. Chained comparators are independent of each other, particularly the result of the first does not depend on the latter. If you want to make your `equals` consistent with `compareTo` you'll have to violate one of the two contracts (there is btw. no requirement for equals being consistent with compareTo so that's your escape hatch if you wanted to - bad idea though). Even if you could represent object identity as a long you couldn't get around that. – Voo Apr 01 '19 at 14:59
  • @Voo "Your problem is inherently different from chained comparators." I don't yet see how. Let's assume for a moment there was a method `Integer identity()` on `Object`. Then I could do `Comparator.comparing(Foo::getName).thenComparing(Object::identity)`. Which contract would be broken here? – Balz Guenat Apr 02 '19 at 12:26
6

You can assign each object a universally unique identifier (UUID), also called a globally unique identifier (GUID), as its identity property. A UUID is comparable and consistent with equals. Java already has a UUID class, and once generated, you can use the string representation for persistence. A dedicated property also ensures that the identity is stable across versions, threads, and machines. You could also use an incrementing ID if you have a way of ensuring that everything gets a unique ID, but a standard UUID implementation protects you from problems with set merges and with parallel systems generating data at the same time.

If you use anything else for the comparison, the object is comparable in a way that is separate from its identity/value. You then need to define what comparable means for this object and document it. For example, people are comparable by name, date of birth, height, or a combination in order of precedence; most naturally by name as a convention (for easier lookup by humans), which is separate from whether two people are the same person. You will also have to accept that compareTo and equals are disjoint because they are based on different things.
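
A minimal sketch of the UUID idea, assuming a hypothetical `Person` class whose identity is a random UUID: equals() and hashCode() are based on the id, and compareTo() orders by name and then by id, so it returns 0 only for objects with the same id (barring a UUID collision, which is astronomically unlikely).

```java
import java.util.Comparator;
import java.util.UUID;

class UuidTieBreakDemo {
    public static void main(String[] args) {
        Person first = new Person("Ann");
        Person second = new Person("Ann");
        System.out.println(first.compareTo(second) != 0); // same name, different ids
        System.out.println(first.compareTo(first) == 0);  // same object
    }
}

// Hypothetical class whose identity is a dedicated UUID property.
final class Person implements Comparable<Person> {
    // Name first, then the UUID as a tie-breaker (UUID implements Comparable<UUID>).
    private static final Comparator<Person> ORDER =
            Comparator.comparing(Person::getName).thenComparing(Person::getId);

    private final String name;
    private final UUID id = UUID.randomUUID();

    Person(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    UUID getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Person && id.equals(((Person) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }

    @Override
    public int compareTo(Person other) {
        return ORDER.compare(this, other);
    }
}
```

The id could be persisted via `id.toString()` and restored with `UUID.fromString`, which keeps the ordering stable across runs.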

Tezra
  • You could explain where to get that UUID, and why collision isn't a (practical) problem. – Yakk - Adam Nevraumont Apr 01 '19 at 15:43
  • @Yakk-AdamNevraumont Added links and expanded a little. I didn't go into detail of the UUID because I want to stress more the "unique id" part. UUID is just an implementation provided by Java. – Tezra Apr 01 '19 at 16:06
5

You could add a second property (say int id or long id) which would be unique for each instance of your class (you can have a static counter variable and use it to initialize the id in your constructor).

Then your compareTo method can first compare the names, and if the names are equal, compare the ids.

Since each instance has a different id, compareTo will never return 0.
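
A short sketch of this idea, with made-up names (`Named`, `NEXT_ID`). I use an `AtomicLong` for the static counter here so that ids stay unique when several threads create instances; that is my choice for the sketch, and a plain counter would do in single-threaded code.

```java
import java.util.concurrent.atomic.AtomicLong;

class CounterIdDemo {
    public static void main(String[] args) {
        Named first = new Named("x");
        Named second = new Named("x");
        System.out.println(first.compareTo(second));  // negative: created earlier
        System.out.println(second.compareTo(first));  // positive
        System.out.println(first.compareTo(first));   // 0 only for the same object
    }
}

// Hypothetical class: identity-based equals() (inherited from Object),
// compareTo() by name with a per-instance id as tie-breaker.
final class Named implements Comparable<Named> {
    // Shared counter; getAndIncrement() is atomic, so ids are unique per JVM run.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private final long id = NEXT_ID.getAndIncrement();
    private final String name;

    Named(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public int compareTo(Named other) {
        int byName = name.compareTo(other.name);
        // Long.compare avoids the overflow that plain subtraction could cause.
        return byName != 0 ? byName : Long.compare(id, other.id);
    }
}
```

Note that the ids are only unique within one run of the JVM, so this sketch does not cover serialization.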

Eran
  • 1
    This sounds simple at first but becomes tricky once multiple threads create objects and the counter has to be synchronized. It would also complicate things if you wanted to (de)serialize objects, as the ID would have to be regenerated. – Balz Guenat Apr 01 '19 at 08:20
  • @BalzGuenat yes, the counter would have to be synchronized to support multiple threads. And as for (de)serialization, the `id` would have to be serialized. – Eran Apr 01 '19 at 08:37
  • 1
    "the id would have to be serialized." Well, no, because then you'd end up with multiple objects with the same ID. Say execution 1 of the program will create an object with ID 42, serializes it and stores it on disk. Execution 2 then creates its own object with ID 42 but also deserializes the object from execution 1, resulting in two objects with the same ID. There are solutions to this of course, but it gets way too complicated for the task at hand. – Balz Guenat Apr 01 '19 at 08:45
  • 1
    The second property can be the hashCode. And with using the hashCode, I think this is the correct answer as it implements an order relation that is consistent with the equals implementation. – kutschkem Apr 01 '19 at 10:07
  • @kutschkem not if you want it to be unique. `hashCode`s are not unique. – Eran Apr 01 '19 at 10:08
  • @Eran How likely is a hash collision for the identityHashCode? And won't a custom counter also have the problem in principle (an int/long can wrap around after a while...)? – kutschkem Apr 01 '19 at 10:09
  • @kutschkem assuming the counter is reset when the JVM starts, it is not very likely a long counter will wrap around. `Long.MAX_VALUE` is a large number. I don't know how likely a hash collision for identityHashCode is. – Eran Apr 01 '19 at 10:12
  • @Eran I at first thought you were right about hash collisions, but the answer to this bug entry leads me to believe two live objects should (never?) collide (although this is explicitly not guaranteed by the specification): https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6809470 – kutschkem Apr 01 '19 at 10:20
  • Use a UUID instead of counter ID :p – Koekje Apr 01 '19 at 13:50
  • @kutschkem Clearly that claim in the bug report is wrong "Identity hashcode values are based upon heap pointer address". Just think about this for a second: If this were true, objects would get a new hashcode every time the GC moves them around. In reality the hashcode is stored in the object header. I think the original value is based on the heap address, but you already see the problem: If an object is moved, a new object could be allocated at the same spot and that heap address be used for its hash again leading to multiple live objects with the same identityhashcode. – Voo Apr 01 '19 at 15:06
  • (In retrospect, "the claim is wrong" is incorrect. I mean hashcodes are based on heap addresses; it's just that heap addresses are not constant, so the conclusion was wrong.) – Voo Apr 01 '19 at 15:15
  • @BalzGuenat: If needed for performance reasons, it would be possible to allocate blocks of IDs to each thread so that they can safely create multiple objects between each entry into a synchronized section. But you're right, doing this safely and efficiently quickly gets rather complicated. – Ilmari Karonen Apr 01 '19 at 21:58
1

While I stand by my original answer that you should use a UUID property for a stable and consistent compare/equality setup, I figured I'd go ahead and answer the question of "how far could you go if you were REALLY paranoid and wanted a guaranteed unique identity for comparable".

In short: if you don't trust UUID uniqueness or identity uniqueness, just use as many UUIDs as it takes to prove god is actively conspiring against you. (Note that while this is not technically guaranteed not to throw an exception, needing 2 UUIDs should be overkill in any sane universe.)

import java.util.ArrayList;
import java.util.UUID;

public class Test implements Comparable<Test>{

    private final UUID antiCollisionProp = UUID.randomUUID();
    private final ArrayList<UUID> antiuniverseProp = new ArrayList<UUID>();

    private UUID getParanoiaLevelId(int i) {
        // Grow the list until index i exists (size must exceed i before get(i)).
        while(antiuniverseProp.size() <= i) {
            antiuniverseProp.add(UUID.randomUUID());
        }

        return antiuniverseProp.get(i);
    }

    @Override
    public int compareTo(Test o) {
        if(this == o)
            return 0;

        // Integer.compare avoids the overflow that subtracting two ints could cause.
        int temp = Integer.compare(System.identityHashCode(this), System.identityHashCode(o));
        if(temp != 0)
            return temp;

        //If the universe hates you
        temp = this.antiCollisionProp.compareTo(o.antiCollisionProp);
        if(temp != 0)
            return temp;

        //If the universe is actively out to get you
        temp = Integer.compare(System.identityHashCode(this.antiCollisionProp), System.identityHashCode(o.antiCollisionProp));
        if(temp != 0)
            return temp;

        for(int i = 0; i < Integer.MAX_VALUE; i++) {
            UUID id1 = this.getParanoiaLevelId(i);
            UUID id2 = o.getParanoiaLevelId(i);
            temp = id1.compareTo(id2);
            if(temp != 0)
                return temp;

            temp = Integer.compare(System.identityHashCode(id1), System.identityHashCode(id2));
            if(temp != 0)
                return temp;
        }

        // If you reach this point, I have no idea what you did to deserve this
        throw new IllegalStateException("RAGNAROK HAS COME! THE MIDGARD SERPENT AWAKENS!");
    }

}
Tezra
0

I assume that for two objects with the same name, if equals() returns false, then compareTo() should not return 0. If that is what you want, the following can help:

  • Override hashCode() and make sure it doesn't rely solely on name
  • Implement compareTo() as follows:
public int compareTo(MyObject object) {
    // Order by name first; fall back to the hash code only when the names are equal.
    int byName = this.getName().compareTo(object.getName());
    return byName != 0 ? byName : Integer.compare(this.hashCode(), object.hashCode());
}
Darshan Mehta
  • `Object.hashCode` has the same problem as `System.identityHashCode` in that in case of a collision, compareTo will return 0 for two different objects. – Balz Guenat Apr 01 '19 at 08:22
  • That is the reason why I advised overriding `hashcode()` and make it rely not just on name. – Darshan Mehta Apr 01 '19 at 08:26
  • any hash function, by definition, will have collisions, regardless of whether it relies on just the name or not. – Balz Guenat Apr 01 '19 at 08:33
0

Your objects are unique, but as Eran said, you may need an extra counter or rehash code to handle any collisions.

private static Set<Pair<C, C>> collisions = ...;

@Override
public boolean equals(Object other) {
    return this == other;
}

@Override
public int compareTo(C other) {
    ...
    if (this == other) {
        return 0;
    }
    if (super.equals(other)) {
        // Some stable order would be fine:
        // return either -1 or 1
        if (collisions.contains(new Pair<>(other, this))) {
            return 1;
        } else if (!collisions.contains(new Pair<>(this, other))) {
            collisions.add(new Pair<>(this, other));
        }
        return 1;
    }
    ...
}

So go with Eran's answer, or state the requirement as such in the question.

  • One might consider the overhead of non-identical 0 comparisons negligible.
  • One might look into ideal hash functions if, at some point, no more instances are created. This implies you have a collection of all instances.
Joop Eggen
  • 1
    This is basically a resource leak; anything that ever collided has its lifetime extended to the end of the program. Thanks, however: I can use this as a good example of why garbage collection doesn't imply no resource leaks. – Yakk - Adam Nevraumont Apr 01 '19 at 15:44
  • @Yakk-AdamNevraumont true, just the mention of `static` should trigger immediate nausea. "So go with the answer of Eran". The code was just a tentative argument. There is something fishy here; Eran's solution is probably also not what is actually desired. Does the OP want an ideal hash function? – Joop Eggen Apr 01 '19 at 15:55
  • 1
    Contrary to all other solutions this can at least be made to work correctly without violating either equals or compareTo contracts. Although if your solution involves WeakHashMaps you already have much bigger problems. – Voo Apr 01 '19 at 16:31
  • @Voo but compareTo only returns 0 or 1... Is the second 1 supposed to be -1, or is there a return -1 missing? Also, it would probably be more efficient to remove the static set and just use `System.identityHashCode` on that object to resolve collisions. If you have 2 collisions, that should prove god hates you. =P – Tezra Apr 02 '19 at 19:18
  • @Tezra "Made to work", agreed that the current solution is not going to be correct. – Voo Apr 02 '19 at 19:27
0

There are times (although rare) when it is necessary to implement an identity-based compareTo override. In my case, I was implementing java.util.concurrent.Delayed.

Since the JDK also implements this interface, I thought I would share the JDK's solution, which uses an atomically incrementing sequence number. Here is a snippet from ScheduledThreadPoolExecutor (slightly modified for clarity):


    /**
     * Sequence number to break scheduling ties, and in turn to
     * guarantee FIFO order among tied entries.
     */
    private static final AtomicLong sequencer = new AtomicLong();

    private class ScheduledFutureTask<V>
            extends FutureTask<V> implements RunnableScheduledFuture<V> {

        /** Sequence number to break ties FIFO */
        private final long sequenceNumber = sequencer.getAndIncrement();
        
    }

If the other fields used in compareTo all compare as equal, this sequenceNumber value is used to break ties. The range of a 64-bit integer (long) is large enough that the counter will not wrap around in practice.
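
For completeness, here is a rough sketch of how such a tie-break might look in your own `Delayed` implementation. The class `DelayedTask` and its fields are hypothetical names of mine; the comparison is written in the spirit of the JDK's approach, not copied from it.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class DelayedTieBreakDemo {
    public static void main(String[] args) {
        long due = System.nanoTime(); // both tasks are due immediately
        DelayQueue<DelayedTask> queue = new DelayQueue<>();
        queue.add(new DelayedTask("first", due));
        queue.add(new DelayedTask("second", due));
        // Same trigger time, so the sequence number decides: FIFO order.
        System.out.println(queue.poll().label());
        System.out.println(queue.poll().label());
    }
}

// Hypothetical Delayed implementation with an identity-like tie-breaker.
final class DelayedTask implements Delayed {
    private static final AtomicLong SEQUENCER = new AtomicLong();

    private final long sequenceNumber = SEQUENCER.getAndIncrement();
    private final long triggerNanos; // absolute time, based on System.nanoTime()
    private final String label;

    DelayedTask(String label, long triggerNanos) {
        this.label = label;
        this.triggerNanos = triggerNanos;
    }

    String label() {
        return label;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(triggerNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        if (other == this) {
            return 0; // only the very same object compares as equal
        }
        if (other instanceof DelayedTask) {
            DelayedTask that = (DelayedTask) other;
            long diff = triggerNanos - that.triggerNanos;
            if (diff != 0) {
                return diff < 0 ? -1 : 1;
            }
            // Tie on trigger time: fall back to creation order.
            return Long.compare(sequenceNumber, that.sequenceNumber);
        }
        // Foreign Delayed implementations: compare remaining delays.
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}
```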

A248