There's no limit to the number of objects a program can create, call GetHashCode() upon, and abandon. There is, however, a limit of 4,294,967,296 (2^32) different values GetHashCode() can return, since its return type is a 32-bit int. If a program happens to call GetHashCode() on 4,294,967,297 different objects, at least one of those calls would have to return a value that had already been returned previously.

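
The counting argument is easier to see with a smaller hash space. The following is only an illustrative sketch: `TinyHash` is a hypothetical helper (not part of .NET) that squeezes a hash code into 256 possible values, so any 257 distinct objects are guaranteed to produce at least one repeat.

```csharp
using System;
using System.Collections.Generic;

public static class PigeonholeDemo
{
    // Hypothetical 8-bit "hash": only 256 possible results.
    public static int TinyHash(object o) => o.GetHashCode() & 0xFF;

    // Returns true if any two of the given objects share a TinyHash value.
    public static bool HasCollision(IEnumerable<object> objects)
    {
        var seen = new HashSet<int>();
        foreach (var o in objects)
        {
            // HashSet.Add returns false when the value is already present.
            if (!seen.Add(TinyHash(o))) return true;
        }
        return false;
    }

    // Creates 257 distinct objects: one more than the number of possible values.
    public static bool Run()
    {
        var objects = new List<object>();
        for (int i = 0; i < 257; i++) objects.Add(new object());
        return HasCollision(objects);
    }
}
```

`PigeonholeDemo.Run()` returns true no matter how the underlying hash codes are chosen; the same reasoning applies to the full 32-bit range, just with 4,294,967,297 objects instead of 257.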
It would theoretically be possible for the system to keep a pool of hash-code values, and for objects which are abandoned to have their hash codes put back in the pool, so that GetHashCode() could guarantee that it will never return the same value for any two live objects (assuming there are no more than 4,294,967,296 live objects, at least). On the other hand, keeping such information would be expensive and would not really offer much benefit. From a practical perspective, it's just as good to have the system generate an arbitrary number either when an object is constructed or the first time GetHashCode() is called upon it. There will be occasional collisions, but generally not enough to bother well-written code.

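
As a rough illustration of what "well-written code" means here: hash-based collections treat the hash code only as a hint and fall back on Equals to tell keys apart, so collisions cost time but not correctness. The sketch below uses a hypothetical `CollidingKey` type whose hash codes all collide on purpose; a Dictionary still keeps the keys distinct.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type, written only to force hash collisions.
public sealed class CollidingKey : IEquatable<CollidingKey>
{
    public string Name { get; }
    public CollidingKey(string name) { Name = name; }

    // Every instance returns the same hash code, so every key collides.
    public override int GetHashCode() => 42;

    // Equality is decided by Name, independently of the hash code.
    public bool Equals(CollidingKey other) => other != null && Name == other.Name;
    public override bool Equals(object obj) => Equals(obj as CollidingKey);
}

public static class CollisionDemo
{
    public static Dictionary<CollidingKey, int> Build()
    {
        return new Dictionary<CollidingKey, int>
        {
            [new CollidingKey("a")] = 1,
            [new CollidingKey("b")] = 2,
        };
    }
}
```

Looking up `new CollidingKey("a")` in the result of `CollisionDemo.Build()` still finds the right entry. With every key in the same bucket, lookups degrade toward a linear scan, which is why occasional collisions are harmless but systematic ones are not.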
BTW, I've sometimes thought it would be useful for each object to have a 64-bit ID which would be guaranteed unique, and which would also rank objects in order of creation. A 64-bit ID would never overflow within the lifetime of any foreseeable program, and being able to assign objects a ranking could be helpful in some caching or interning scenarios. For example, if a program generates some large objects by reading data from files, and frequently scans them to find differences, it may often find objects that contain identical data but are distinct. If two distinct objects are found to be identical and interchangeable, replacing references to the newer one with references to the older one may considerably expedite future comparisons among them; if many matching objects are compared among each other, many of the references to newer objects will get replaced with references to the oldest ones, without having to explicitly cache anything. Absent some means of determining "age", however, such an approach wouldn't really work, since there would be no way to know which reference should be abandoned in favor of the other.
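
Since the runtime offers no such ID, a type can approximate the idea for its own instances. The following is a minimal sketch under stated assumptions: `Blob`, `CreationId`, and `BlobComparer.SameContent` are hypothetical names, the ID comes from a static counter bumped with Interlocked.Increment, and instances are assumed to be immutable so that equal ones really are interchangeable.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical immutable holder for data read from a file.
public sealed class Blob
{
    private static long _nextId; // process-wide creation counter (assumption)

    // Smaller CreationId means the object was created earlier.
    public long CreationId { get; } = Interlocked.Increment(ref _nextId);
    public byte[] Data { get; }

    public Blob(byte[] data) { Data = data; }
}

public static class BlobComparer
{
    // Compares two blobs by content. If they match, both references are made
    // to point at the older object, so a later comparison of the same pair
    // succeeds on the cheap reference check.
    public static bool SameContent(ref Blob a, ref Blob b)
    {
        if (ReferenceEquals(a, b)) return true;          // fast path
        if (!a.Data.SequenceEqual(b.Data)) return false; // full scan

        if (a.CreationId < b.CreationId) b = a; else a = b;
        return true;
    }
}
```

Because every match keeps the older object, repeated comparisons among a group of equal blobs tend to converge on the oldest one, without any explicit cache. Only the references passed in by `ref` are updated, so the caller would pass the actual storage locations (fields or array elements) it wants consolidated.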