I have a class that is heavily used in Sets and Dictionaries. For performance reasons, this class implements Hashable the old way and caches the computed hash:

let hashValue: Int

init(...) {
    self.hashValue = ...
}

In Xcode 10.2 I see a warning that hashValue is deprecated and will stop being a protocol requirement soon.

What bothers me is that there seems to be no way to cache the computed hash, because hash(into:) does not return anything.

func hash(into hasher: inout Hasher) {
    hasher.combine(...)
}

Consider the following example in a playground:

class Class: Hashable {
    let param: Int

    init(param: Int) {
        self.param = param
    }

    static func ==(lhs: Class, rhs: Class) -> Bool {
        return lhs.param == rhs.param
    }

    public func hash(into hasher: inout Hasher) {
        print("in hash")
        hasher.combine(param)
    }
}

var dict = [Class: Int]()
let instance = Class(param: 1)
dict[instance] = 1
dict[instance] = 2

You will see the following output:

in hash
in hash
in hash

I have no idea why we see 3 calls instead of 2, but we do =).

So every time you use the same instance as a dictionary key or add it to a set, you get a new hash(into:) call.

In my code this overhead turned out to be very expensive. Does anyone know a workaround?

Tim

1 Answer

One option is to create your own Hasher, feed it the "essential components" of your instance and then call finalize() in order to get out an Int hash value, which can be cached.

For example:

class C : Hashable {
  let param: Int

  private lazy var cachedHashValue: Int = {
    var hasher = Hasher()
    hasher.combine(param)
    // ... repeat for other "essential components"
    return hasher.finalize()
  }()

  init(param: Int) {
    self.param = param
  }

  static func ==(lhs: C, rhs: C) -> Bool {
    return lhs.param == rhs.param
  }

  public func hash(into hasher: inout Hasher) {
    hasher.combine(cachedHashValue)
  }
}

A couple of things to note about this:

  • It relies on your "essential components" being immutable, otherwise a new hash value would need calculating upon mutation.
  • Hash values aren't guaranteed to remain stable across executions of the program, so don't serialise cachedHashValue.
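
If the essential components do have to be mutable, the first point above could be handled by invalidating the cache on mutation. The following is only a rough sketch under that assumption: `MutableKey` is a hypothetical name, the cache is not thread-safe, and mutating an instance while it is stored in a Set or used as a Dictionary key still breaks the collection, cached hash or not.

```swift
final class MutableKey: Hashable {
  // Hypothetical mutable "essential component"; any change invalidates the cache.
  var param: Int {
    didSet { cachedHashValue = nil }
  }

  // nil means "not computed yet" or "invalidated by a mutation".
  private var cachedHashValue: Int?

  init(param: Int) {
    self.param = param
  }

  static func ==(lhs: MutableKey, rhs: MutableKey) -> Bool {
    return lhs.param == rhs.param
  }

  func hash(into hasher: inout Hasher) {
    let value: Int
    if let cached = cachedHashValue {
      value = cached
    } else {
      // Recompute from the essential components and remember the result.
      var inner = Hasher()
      inner.combine(param)
      value = inner.finalize()
      cachedHashValue = value
    }
    hasher.combine(value)
  }
}
```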

Obviously in the case of storing a single Int this won't be all that effective, but for more expensive instances this could well help improve performance.
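
To illustrate with a more expensive instance, here is a small sketch that counts how often the full hash is actually computed. `Document`, `words` and `fullHashCount` are hypothetical names made up for this example and are not part of the question's code.

```swift
final class Document: Hashable {
  let words: [String]

  // Counts how often the expensive components were actually hashed.
  private(set) var fullHashCount = 0

  private lazy var cachedHashValue: Int = {
    self.fullHashCount += 1
    var hasher = Hasher()
    hasher.combine(self.words)  // the expensive part: hashes every string
    return hasher.finalize()
  }()

  init(words: [String]) {
    self.words = words
  }

  static func ==(lhs: Document, rhs: Document) -> Bool {
    return lhs.words == rhs.words
  }

  func hash(into hasher: inout Hasher) {
    // Only a single Int is fed to the collection's hasher on each call.
    hasher.combine(cachedHashValue)
  }
}

let doc = Document(words: Array(repeating: "lorem", count: 10_000))

var counts = [Document: Int]()
counts[doc] = 1
counts[doc] = 2

var seen = Set<Document>()
seen.insert(doc)

print(doc.fullHashCount)  // prints 1
```

However many times hash(into:) is called, the array is only hashed once per instance. Note that == still compares the full arrays, so the cache only removes the hashing cost, not the cost of equality checks.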

Hamish
  • Looks like a hack =)) – Tim Aug 02 '19 at 20:03
  • I will measure it and share a report a bit later =) – Tim Aug 02 '19 at 20:03
  • I wouldn't really call it a hack given this is how `Hasher` is supposed to be used. I'll admit it's a bit more convoluted than how you would have done it with `hashValue`, but, as Martin says, the new API allows for much more robust hash values. – Hamish Aug 02 '19 at 20:14