I was looking at some code from a JavaDays talk. The author said that this probabilistic approach is a very effective way to deduplicate Strings, as an analogue of the String.intern() method:
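import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

public class CHMDeduplicator<T> {
    private final int prob;
    private final Map<T, T> map;

    public CHMDeduplicator(double prob) {
        this.prob = (int) (Integer.MIN_VALUE + prob * (1L << 32));
        this.map = new ConcurrentHashMap<>();
    }

    public T dedup(T t) {
        if (ThreadLocalRandom.current().nextInt() > prob) {
            return t;
        }
        // putIfAbsent returns the previously stored instance, or null if t was just inserted
        T exist = map.putIfAbsent(t, t);
        return (exist == null) ? t : exist;
    }
}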
Could you please explain what effect the probability has in this line:
if (ThreadLocalRandom.current().nextInt() > prob) return t;
This is the original presentation from Java Days: https://shipilev.net/talks/jpoint-April2015-string-catechism.pdf (slide 56)
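My tentative reading, which may well be wrong, is that the constructor maps prob from [0, 1] onto the int range, so that the check lets roughly a prob fraction of calls through to the map and returns the argument untouched for the rest. Below is a small sketch I would use to check that reading. ThresholdDemo, threshold and fractionReachingMap are names I made up for illustration, and I use a seeded java.util.Random instead of ThreadLocalRandom only to make the run repeatable:

```java
import java.util.Random;

public class ThresholdDemo {
    // Same arithmetic as the CHMDeduplicator constructor:
    // maps prob in [0, 1] onto the int range (the cast saturates at Integer.MAX_VALUE).
    public static int threshold(double prob) {
        return (int) (Integer.MIN_VALUE + prob * (1L << 32));
    }

    // Fraction of random ints that do NOT take the early return,
    // i.e. the fraction of dedup() calls that would reach the map.
    public static double fractionReachingMap(double prob, int samples, long seed) {
        int threshold = threshold(prob);
        Random random = new Random(seed); // assumption: stand-in for ThreadLocalRandom
        int reached = 0;
        for (int i = 0; i < samples; i++) {
            if (!(random.nextInt() > threshold)) {
                reached++;
            }
        }
        return (double) reached / samples;
    }

    public static void main(String[] args) {
        for (double p : new double[] {0.0, 0.1, 0.5, 1.0}) {
            System.out.printf("prob=%.1f threshold=%d reached=%.4f%n",
                    p, threshold(p), fractionReachingMap(p, 1_000_000, 42L));
        }
    }
}
```

If this reading is right, the measured fraction should stay close to prob, with prob = 1.0 sending every call to the map and prob = 0.0 sending almost none.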