9

I previously came to the conclusion that if you need a SoftReference with value-based (equals()) equality, then you have a bad design, an interner being the exception. This follows from Google Collections and Guava not including such a class. But I've come across a problem that I think could use such an object.

We have an asset management system in a visual effects render farm with hundreds of processes running the same job, differing only in the frame number each one renders. We have an Oracle database that needs to record all the assets used. Instead of pounding Oracle with identical inserts from all the jobs, of which only one will succeed, the middle-tier asset management system can use a HashSet to record whether the object has already been inserted into Oracle.

I could use a Google MapMaker with an expiration, but I don't want to have to worry about getting the expiration right: some of our renders run for hours and some for days. Using a SoftReference with equals() equality sounds like a much better way, since the JVM will manage garbage collection automatically.

For other problems that I want to solve with a garbage-collected ConcurrentHashMap, I would use a strong reference as the key to get equals() equality and a SoftReference as the value so the JVM has something it can garbage collect. In this case, though, the value doesn't matter, and I have no value to wrap in a SoftReference. So it seems like using a SoftReference with equals() would do the trick.

Any other suggestions on this?

Blair Zajac
  • 1
    Love your question, I've been wondering about that also recently – nanda Feb 12 '10 at 07:58
  • Doesn't `ResourceBundle` do something like this? – Tom Hawtin - tackline Feb 12 '10 at 08:22
  • @nanda what gets added to Oracle is a list of assets (say filenames on an NFS server) generated dynamically; ResourceBundle doesn't seem to be the right fit. I just need a HashSet to record that the filename got recorded in Oracle so another 99 attempts to insert it don't waste CPU cycles in Oracle. – Blair Zajac Feb 12 '10 at 15:53
  • 2
    Since this stuff is really complicated, I could really benefit from more specifics: What is your key type? What is the value type? Approximately what do these types look like? Do you want soft keys, soft values, or both, and why? Would you still need this feature once `MapMaker` supports other eviction policies that respect a specific size limit (e.g., LRU, though what we're doing is not exactly LRU)? And if multiple equal instances can exist, why does it make sense to clean up an entry whenever any single *one* of them gets GC'd? Another one might be just on the verge of being queried. – Kevin Bourrillion Feb 12 '10 at 17:17
  • Data entered into Oracle includes the project name, folder path, asset name, version number and representation name. I have a POJO, RepLookupEntry, with those as fields. The data goes into Oracle if a HashSet doesn't contain the RepLookupEntry. Since I don't have a concurrent HashSet I would use a ConcurrentHashMap. I need soft keys so the GC will evict them, but the keys must compare using equals(). I could use the same SoftReference as the HashMap value, I don't have anything else to put in there. All other RepLookupEntry's in the process are transient, created by the client as an RPC. – Blair Zajac Feb 13 '10 at 06:40
  • BTW, does this discussion better belong on the Guava mailing list? – Blair Zajac Feb 13 '10 at 06:41

3 Answers

1

In most cases when you want to use soft references with Google Collections, you should call

MapMaker.softValues()

With strong keys but soft values, lookups will use equality and key-value pairs will be garbage collected when memory is tight.
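The strong-key, soft-value idea can also be sketched with only the JDK, which may help when there is no natural value to store. This is a minimal sketch and not MapMaker's implementation: the names `SoftValueSeenSet`, `Marker`, and `markSeen` are hypothetical, and the placeholder `new Object()` value is an assumption made so that the garbage collector has something soft to clear.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper: strong keys compared with equals(), soft placeholder values. */
final class SoftValueSeenSet<K> {

    /** Soft reference to a placeholder that remembers the key it was stored under. */
    private static final class Marker<K> extends SoftReference<Object> {
        final K key;

        Marker(K key, ReferenceQueue<Object> queue) {
            super(new Object(), queue); // the placeholder is only softly reachable
            this.key = key;
        }
    }

    private final ConcurrentMap<K, Marker<K>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    /** Returns true if the key was not recorded yet, or its entry had been collected. */
    boolean markSeen(K key) {
        expungeCleared();
        Marker<K> fresh = new Marker<>(key, queue);
        while (true) {
            Marker<K> old = map.putIfAbsent(key, fresh);
            if (old == null) return true;          // first time this key is seen
            if (old.get() != null) return false;   // still recorded
            // Placeholder was cleared but not expunged yet: swap in the fresh marker.
            if (map.replace(key, old, fresh)) return true;
        }
    }

    int size() {
        expungeCleared();
        return map.size();
    }

    /** Removes entries whose placeholder has been cleared by the garbage collector. */
    @SuppressWarnings("unchecked")
    private void expungeCleared() {
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Marker<K> marker = (Marker<K>) ref;
            map.remove(marker.key, marker); // only removes if this marker is still mapped
        }
    }
}
```

Note that the key stays strongly referenced until its cleared marker is expunged, so memory is reclaimed only after a later call drains the queue.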

Jared Levy
  • But I don't have a value to associate with the key, so the key is the only thing that can be put in a soft reference. – Blair Zajac Feb 15 '10 at 12:18
  • @BlairZajac Realize this is ancient but wouldn't `new Object()` work as the soft value, to cause the (non-weak keyed) entry to be removed due to memory pressure? – Partly Cloudy Aug 23 '23 at 10:28
1

Since there is no ConcurrentHashSet using soft references, there are only two approaches:

1.) Your approach with the ConcurrentHashMap

  • Override equals and hashCode in the SoftReference
  • Inside of equals and hashCode only access the object using SoftReference#get
  • Put SoftReference as key, and any object as value (only null is not permitted)
  • If the Reference turns out to be stale (cleared) while hashCode or equals is being evaluated, add it to a deletion queue so that dead keys can be removed from the map periodically.
  • Check for contains via containsKey
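The steps of approach 1 can be sketched roughly as follows. The names `SoftEqualsKey` and `SoftHashSet` are hypothetical. Two choices here are assumptions that depart slightly from the list above: the hash code is cached at construction so it stays stable after the referent is cleared, and stale keys are collected through a standard `ReferenceQueue` instead of a hand-rolled deletion queue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical name: a soft reference compared by its referent's equals(). */
final class SoftEqualsKey<T> extends SoftReference<T> {
    private final int hash; // cached so the key can still be located after clearing

    /** Key stored in the map; enqueued on the given queue once cleared. */
    SoftEqualsKey(T referent, ReferenceQueue<? super T> queue) {
        super(referent, queue);
        this.hash = referent.hashCode();
    }

    /** Temporary key used only for lookups. */
    SoftEqualsKey(T referent) {
        super(referent);
        this.hash = referent.hashCode();
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true; // a cleared key still equals itself
        if (!(other instanceof SoftEqualsKey)) return false;
        T mine = get();
        Object theirs = ((SoftEqualsKey<?>) other).get();
        return mine != null && mine.equals(theirs); // cleared keys equal nothing else
    }
}

/** Hypothetical concurrent set built on the key class above. */
final class SoftHashSet<T> {
    private final ConcurrentMap<SoftEqualsKey<T>, Boolean> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<T> queue = new ReferenceQueue<>();

    /** Returns true if no equal element was present. */
    boolean add(T item) {
        expungeStale();
        return map.putIfAbsent(new SoftEqualsKey<>(item, queue), Boolean.TRUE) == null;
    }

    boolean contains(T item) {
        expungeStale();
        return map.containsKey(new SoftEqualsKey<>(item));
    }

    int size() {
        expungeStale();
        return map.size();
    }

    /** Drops keys whose referents the garbage collector has cleared. */
    private void expungeStale() {
        Reference<? extends T> ref;
        while ((ref = queue.poll()) != null) {
            map.remove(ref); // matches by identity because the referent is gone
        }
    }
}
```

A call such as `set.add(entry)` returning `true` would then be the signal to perform the Oracle insert. Because only the map holds the key objects, entries survive until the JVM decides to clear soft references under memory pressure.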

2.) Use a ConcurrentMultimap<Integer, Set<SoftReference<RepLookupEntry>>> with the hashCode as key and a synchronized set of SoftReferences as values. When you get a hashCode hit, check the contents of all SoftReferences for equality. Not very pretty, I agree, and tricky to synchronize.

If I were in your position, I would not use SoftReferences at all, but rather a ConcurrentHashMap that keeps strong references to your POJOs. Each time a new element arrives, also put it in a ConcurrentLinkedQueue. If the queue grows beyond a certain limit, start removing the oldest elements from the map.
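A minimal sketch of that size-bounded alternative might look like the following. The class name `BoundedSeenSet` and the `maxSize` parameter are hypothetical, and the `AtomicInteger` counter is an added assumption, used because `ConcurrentLinkedQueue.size()` has to traverse the whole queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical name: strong references, oldest entries evicted past a size limit. */
final class BoundedSeenSet<T> {
    private final ConcurrentMap<T, Boolean> seen = new ConcurrentHashMap<>();
    private final Queue<T> insertionOrder = new ConcurrentLinkedQueue<>();
    private final AtomicInteger count = new AtomicInteger();
    private final int maxSize;

    BoundedSeenSet(int maxSize) {
        if (maxSize < 1) throw new IllegalArgumentException("maxSize must be >= 1");
        this.maxSize = maxSize;
    }

    /** Returns true if the item was not present, i.e. the caller should do the insert. */
    boolean add(T item) {
        if (seen.putIfAbsent(item, Boolean.TRUE) != null) {
            return false; // an equal item is already recorded
        }
        insertionOrder.add(item);
        if (count.incrementAndGet() > maxSize) {
            // Over the limit: forget the oldest recorded item (FIFO, not LRU).
            T oldest = insertionOrder.poll();
            if (oldest != null) {
                seen.remove(oldest);
                count.decrementAndGet();
            }
        }
        return true;
    }

    boolean contains(T item) {
        return seen.containsKey(item);
    }
}
```

Under concurrent use the limit is only approximate, since several threads can each be one element over it for a moment. An evicted entry merely costs one redundant insert attempt against Oracle, which the question already treats as harmless.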

Christopher Oezbek
0

I think that this class will meet your need:

import java.util.*;
import java.lang.ref.*;

public class SoftSet<T> extends AbstractSet<T> {

  private final WeakHashMap<T,SoftReference<T>> data = new WeakHashMap<T,SoftReference<T>>();

  public boolean add(T t) {
    return null == data.put(t, new SoftReference<T>(t));
  }

  public boolean remove(Object o) {
    return null != data.remove(o);
  }

  public boolean contains(Object o) {
    return data.containsKey(o);
  }

  public Iterator<T> iterator() {
    return data.keySet().iterator();
  }

  public int size() {
    return data.size();
  }

  public void clear() {
    data.clear();
  }

  public boolean removeAll(Collection<?> c) {
    return data.keySet().removeAll(c);
  }

  public boolean retainAll(Collection<?> c) {
    return data.keySet().retainAll(c);
  }
}

The way this should work is that once the soft reference stored as the map value is cleared, the object is only weakly reachable through the WeakHashMap key, so the entry can be removed from the inner map.

Geoff Reedy
  • Maybe the downvote is because it wraps the object in two separate references, one more than it needs to. Having a single SoftReference subclass with equals() equality in a Google ConcurrentHashMap may be cleaner. – Blair Zajac Feb 19 '10 at 22:42