Switch between equality rules

Question

I have a class with several equivalence rules defined on it (different implementations of equals and hashCode). The data is first generated in one process, where one equivalence rule applies, and then fed to a second process, where the other rule applies. In particular, I am doing a lot of map operations, and equals and hashCode are called implicitly by the standard library (over which I have no control). What do you think is the best way to achieve this? I have two solutions so far:

Define two subclasses with different equals and hashCode. After process 1, do the conversion by instantiating objects of the other subclass.
Introduce mutable state in the class to indicate which equivalence rule to apply.

So which one do you think is better, or are there any other good solutions?

Are you defining `hashCode()` and `equals()` solely for the purposes of `Map` compatibility? — cheeken, Jan 20 '13 at 01:03
@cheeken `HashSet` is also used, but yes, basically only for these kinds of collections. — Kane, Jan 20 '13 at 03:19

cheeken · Answer 1 · 2013-01-20T03:10:04.390

A solution that is perhaps more elegant would be a custom Map class that allows customization of both hashing and equality-evaluation.

trait MappingScheme[K] extends java.util.Comparator[K] {
    def generateHash(key: K): Int
    // compare() is inherited from Comparator; a result of 0 means "equal keys"
}

abstract class CustomSchemeMap[K, V](mappingScheme: MappingScheme[K]) extends scala.collection.mutable.Map[K, V] {
    // Implement the Map methods; use mappingScheme to generate hashes and
    // perform equality checks
}

In your scenario, you would create two custom MappingSchemes and use them as appropriate in your CustomSchemeMaps. This approach is more performant than the solutions you suggest (no extra instance creation, and you don't have to mutate your objects), and it also makes more logical sense and is easier to follow.
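To make the idea concrete, here is a minimal sketch rather than a full Map implementation. The names KeyScheme, SchemeMap, ExactCase and IgnoreCase are hypothetical and not part of any library. For brevity the sketch wraps keys internally, so it does not deliver the allocation savings mentioned above; it only shows how callers could keep using plain keys while the scheme decides hashing and equality.

```scala
import scala.collection.mutable

// Hypothetical strategy: how keys of type K are hashed and compared.
trait KeyScheme[K] {
  def hash(key: K): Int
  def same(a: K, b: K): Boolean
}

// Not a full Map: only put/get/remove/size are provided in this sketch.
class SchemeMap[K, V](scheme: KeyScheme[K]) {
  // Internal wrapper that routes hashCode/equals through the scheme.
  private class Wrapped(val key: K) {
    override def hashCode(): Int = scheme.hash(key)
    override def equals(other: Any): Boolean = other match {
      case w: Wrapped => scheme.same(key, w.key)
      case _          => false
    }
  }

  private val underlying = mutable.HashMap.empty[Wrapped, V]

  def put(key: K, value: V): Option[V] = underlying.put(new Wrapped(key), value)
  def get(key: K): Option[V] = underlying.get(new Wrapped(key))
  def remove(key: K): Option[V] = underlying.remove(new Wrapped(key))
  def size: Int = underlying.size
}

// Two example schemes over String keys (illustrative only).
object ExactCase extends KeyScheme[String] {
  def hash(key: String): Int = key.hashCode
  def same(a: String, b: String): Boolean = a == b
}

object IgnoreCase extends KeyScheme[String] {
  def hash(key: String): Int = key.toLowerCase.hashCode
  def same(a: String, b: String): Boolean = a.toLowerCase == b.toLowerCase
}
```

With these definitions, a `new SchemeMap[String, Int](ExactCase)` treats "Key" and "KEY" as two entries, while a `new SchemeMap[String, Int](IgnoreCase)` treats them as one. The key objects themselves are never changed or subclassed.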

However, implementing a Map can be a tall order. If that seems out of reach, I would create simple adapter classes to wrap around your objects and feed those into the maps.

class KeyableAdapter1(val o: OriginalClass) {
    // The hash must be derived from the same fields that equals compares
    override def hashCode(): Int = o.stuff.##
    override def equals(that: Any): Boolean = that match {
        case other: KeyableAdapter1 => o.stuff == other.o.stuff
        case _ => false
    }
    def get(): OriginalClass = o // To get it back out, if you need to
}

class KeyableAdapter2(val o: OriginalClass) {
    override def hashCode(): Int = o.otherStuff.##
    override def equals(that: Any): Boolean = that match {
        case other: KeyableAdapter2 => o.otherStuff == other.o.otherStuff
        case _ => false
    }
    def get(): OriginalClass = o
}

// Later
myMap.put(new KeyableAdapter1(o1), stuff)
myOtherMap.put(new KeyableAdapter2(o1), moreStuff)

This is similar to the subclassing approach, except that you can get the original object back via get(), and it's easier to follow (at least to my mind).
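As a rough illustration of the hand-over between the two processes, the sketch below re-keys a map from one adapter to the other. Record, ById, ByName and Rekey are hypothetical names chosen for this example, and the two rules (same id, same name ignoring case) are assumptions standing in for your real equivalence rules.

```scala
// Hypothetical stand-in for the original class.
class Record(val id: Int, val name: String)

// Rule 1: two records are equivalent when their ids match.
class ById(val record: Record) {
  override def hashCode(): Int = record.id.##
  override def equals(other: Any): Boolean = other match {
    case that: ById => record.id == that.record.id
    case _          => false
  }
}

// Rule 2: two records are equivalent when their names match, ignoring case.
class ByName(val record: Record) {
  private def key: String = record.name.toLowerCase
  override def hashCode(): Int = key.##
  override def equals(other: Any): Boolean = other match {
    case that: ByName => key == that.key
    case _            => false
  }
}

object Rekey {
  // Takes the output of process 1 (keyed by id) and re-keys it for process 2.
  // If several entries collapse onto one name, only one value survives.
  def byName[V](stage1: collection.Map[ById, V]): Map[ByName, V] =
    stage1.map { case (k, v) => new ByName(k.record) -> v }.toMap
}
```

The wrapped Record instances are shared, so the conversion only allocates the small adapter objects. Entries that are distinct under the first rule may merge under the second, so decide explicitly which value should win if that matters.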

score 1 · Answer 2 · answered Jan 20 '13 at 07:13

Define two subclasses with different equals and hashCode. After process 1, do the conversion by initiating objects of the other subclass.

This works, but I don't think the two classes differ semantically. They would just be used in different contexts, although they represent the same thing.

Introduce mutable states in the class to indicate which equivalence rule to apply.

Never do this; it is broken:

If you change the state globally, it is a lot of magic that might cause many problems, especially if you use these classes from several threads. You can break existing maps, and so on.
If you change it locally, it is less magical, but you will almost surely violate the symmetry required by the equals contract, i.e. for all objects o1 and o2, o1.equals(o2) must imply o2.equals(o1). You could also compare the comparators (e.g. [1]), which at least preserves the contract. Even so, it is ugly.

[1]

override def equals(o: Any): Boolean = o match {
    case that: MyClass =>
        // comparator.compare is assumed to return a Boolean here
        (that.comparator == this.comparator) && comparator.compare(this, that)
    case _ => false // for null values and other classes
}
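The symmetry problem described above can be reproduced with a small sketch. Item and its byId flag are hypothetical, chosen only to show what happens when each object carries its own mutable rule.

```scala
import scala.collection.mutable

// Hypothetical class whose equivalence rule is a mutable, per-object flag.
class Item(val id: Int, val name: String, var byId: Boolean) {
  override def hashCode(): Int = if (byId) id.## else name.##
  override def equals(other: Any): Boolean = other match {
    case that: Item => if (byId) id == that.id else name == that.name
    case _          => false
  }
}

object MutableRuleDemo {
  def main(args: Array[String]): Unit = {
    val a = new Item(1, "x", byId = true)
    val b = new Item(1, "y", byId = false)

    println(a == b) // a applies the id rule: the ids match
    println(b == a) // b applies the name rule: the names differ

    val set = mutable.HashSet(a)
    a.byId = false           // a's hash code changes while it sits in the set
    println(set.contains(a)) // the lookup can no longer be relied on
  }
}
```

Here a == b holds while b == a does not, which is exactly the symmetry violation. Flipping the flag on an object already stored in a hash-based collection also changes its hash code, so the collection may fail to find it again.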

Edmondo · Answer 3 · 2013-01-20T11:26:42.833

This is an improvement on the first solution proposed by @cheeken. I strongly advise against adopting the second unless you are working on a trivial project. With the second approach you can't enforce that all the items you put in the map have their hash computed with the same hasher, and this might lead to wrong and unexpected behaviour that is hard to diagnose at runtime.

The right way to go is to take inspiration from the HashMap in the Scala library:

@SerialVersionUID(2L)
class HashMap[A, +B] extends Map[A,B] with MapLike[A, B, HashMap[A, B]] with Serializable with CustomParallelizable[(A, B), ParHashMap[A, B]] {

  override def size: Int = 0

  override def empty = HashMap.empty[A, B]

  def iterator: Iterator[(A,B)] = Iterator.empty

  override def foreach[U](f: ((A, B)) =>  U): Unit = { }

  def get(key: A): Option[B] =
    get0(key, computeHash(key), 0)

  override def updated [B1 >: B] (key: A, value: B1): HashMap[A, B1] =
    updated0(key, computeHash(key), 0, value, null, null)

  override def + [B1 >: B] (kv: (A, B1)): HashMap[A, B1] =
    updated0(kv._1, computeHash(kv._1), 0, kv._2, kv, null)

  override def + [B1 >: B] (elem1: (A, B1), elem2: (A, B1), elems: (A, B1) *): HashMap[A, B1] =
    this + elem1 + elem2 ++ elems
    // TODO: optimize (might be able to use mutable updates)

  def - (key: A): HashMap[A, B] =
    removed0(key, computeHash(key), 0)

  protected def elemHashCode(key: A) = key.##

  protected final def improve(hcode: Int) = {
    var h: Int = hcode + ~(hcode << 9)
    h = h ^ (h >>> 14)
    h = h + (h << 4)
    h ^ (h >>> 10)
  }

  private[collection] def computeHash(key: A) = improve(elemHashCode(key))

  protected type Merger[B1] = ((A, B1), (A, B1)) => (A, B1)

  private[collection] def get0(key: A, hash: Int, level: Int): Option[B] = None

  private[collection] def updated0[B1 >: B](key: A, hash: Int, level: Int, value: B1, kv: (A, B1), merger: Merger[B1]): HashMap[A, B1] = 
    new HashMap.HashMap1(key, hash, value, kv)

  protected def removed0(key: A, hash: Int, level: Int): HashMap[A, B] = this

  protected def writeReplace(): AnyRef = new HashMap.SerializationProxy(this)

  def split: Seq[HashMap[A, B]] = Seq(this)

  def merge[B1 >: B](that: HashMap[A, B1], merger: Merger[B1] = null): HashMap[A, B1] = merge0(that, 0, merger)

  protected def merge0[B1 >: B](that: HashMap[A, B1], level: Int, merger: Merger[B1]): HashMap[A, B1] = that

  override def par = ParHashMap.fromTrie(this)

}

If you look at it, hashing goes through elemHashCode, so you can just write the following class:

class CustomHashMap[A,+B](val hashCalculator:HashCalculator[A]) extends HashMap[A,B] {
    //protected def elemHashCode(key: A) = key.## 
    override def elemHashCode(key: A) = hashCalculator(key)
}

You have to make sure that all the public methods behave correctly, including par (you need to implement a parallel hash map that uses your special hasher) and merge, as well as empty, which should return CustomHashMap.empty[A,B] rather than HashMap.empty[A,B].

You only address the hash code part, what about `equals`? Also you say "With the second approach you can't enforce that all the items you put in the map have their hash computed with the same hasher" and I can't quite follow you. Surely in a `HashMap[KeyableAdapter1, T]` you are pretty much sure that the hash code used comes from `KeyableAdapter1`, right? — Régis Jean-Gilles, Jan 20 '13 at 10:30
The HashMap[A,B] does not change its type parameters, but gets a custom hashCalculator as an immutable property. You don't want to change the HashMap[A,B] into a HashMap[KeyableAdapter1,T], or iteration, mapping and filtering will change semantics as well. — Edmondo, Jan 20 '13 at 11:27
And the HashCalculator can be HashAndEqualsCalculator if you prefer — Edmondo, Jan 20 '13 at 11:50
"And the HashCalculator can be HashAndEqualsCalculator if you prefer" -> but you have shown where to put hashCalculator, which is fine, but where do you put the call to your custom equals? Just renaming HashCalculator into HashAndEqualsCalculator won't magically make the HashMap use your custom equals. "You don't want to change the HashMap[A,B] into a HashMap[KeyableAdapter1,T] or iteration, mapping and filtering will change semantic as well" -> except that explicitly wrapping the values into KeyableAdapter1 was precisely what @cheeken was talking about in his second alternative. — Régis Jean-Gilles, Jan 20 '13 at 12:23
You are right, I misinterpreted the question and also @cheeken's answer. — Edmondo, Jan 20 '13 at 12:28

score 0 · Accepted Answer · answered Jan 25 '13 at 18:54

In the end I found that writing my own customized Map is the way to go (at least for my problem). After digging into the Scala standard library for a while, I figured out that it is extremely easy. Whether mutable or not, the element equality and hashCode methods in HashMap are inherited from HashTable and HashTable.Utils and are protected, meaning any subclass can override them easily. So the following is what I ended up with:

trait Equility[T] {
  def equal(t1: T, t2: T): Boolean
  def hash(t: T): Int
}

class MapWithEquility[K, V](e: Equility[K]) extends scala.collection.mutable.HashMap[K, V] {
  override def elemHashCode(key: K) = e.hash(key)
  override def elemEquals(key1: K, key2: K) = e.equal(key1, key2)
}

I did a simple test and it worked well.
