3

Inside our system we present session of client as class Session.

Historically hashcode of this class is mutable - it is 0 on creation and change to user id at some point of time.

There are existed two session managers in the system.

  1. Client Sessions Manager - hold active client session is the client. Inside of this manager is simple ConcurrentHashMap< Long, Session > where key is user id.
  2. Memory Sessions Manager - hold sessions which are not yet collected by GC. Inside there are WeakHashMap < Session, Long > (value here is user id)

When client connects and logged in next flow happens:

  1. Client session is created.
  2. Session put into Memory Sessions Manager (hashcode at this point is 0 so they are put into single bucket)
  3. Client is logged in and session put into Client Sessions Manager with correct hash code.
  4. Client session is closed and removed from Client Sessions Manager.

As result after collection of all strong references on Session object it must be removed from WeakHashMap. (An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key will not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed.) But for some reason they are stayed there. Code of it below:

public class MemorySessionManager implements Runnable {
        private static final Logger logger = LoggerFactory.getLogger("checker");

        private Map<Session, Long> sessions;

        public MemorySessionManager() {
            this.sessions = new WeakHashMap<>();
        }

        public synchronized void addSession(Session sess) {
            sessions.put(sess, sess.getId());
        }

        public void run() {
            Set<Session> sessionsToCheck = new HashSet<>();
            synchronized (this) {
                sessionsToCheck.addAll(sessions.keySet());
            }
            for (Session sess : sessionsToCheck) {
                logger.warn("MemorySessionManager: Is still here: " + sess);
            }
        }
   }

Simplified program flow and session class (without useless info).

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClientSessionManager {

    private Map<Long, Session> sessions;

    public ClientSessionManager() {
       this.sessions = new ConcurrentHashMap<>();
    }

    public void addSession(Session session) {
       sessions.put(session.getUserId(), session);
    }

    public Session removeSession(long code) {
       return sessions.remove(code);
    }
}

public class Session {

    private long userId;

    public void setUserId(long userId) {
        this.userId = userId;
    }

    public long getUserId() {
        return userId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o)
           return true;
        if (o == null || getClass() != o.getClass())
           return false;

        Session session = (Session) o;

        return userId == session.userId;

   }

   @Override
   public int hashCode() {
       return (int) (userId ^ (userId >>> 32));
   }
}

public class Process {

    private static final ClientSessionManager CLIENT_SESSION_MANAGER = new ClientSessionManager();
    private static final MemorySessionManager MEMORY_SESSION_MANAGER = new MemorySessionManager();

    /**
     * Login user, create its session and register session into appropriate managers.
     * 
     * @param userId id of user
     */
    public void login(long userId) {
        Session session = new Session();
        MEMORY_SESSION_MANAGER.addSession(session);
        session.setUserId(userId);
        CLIENT_SESSION_MANAGER.addSession(session);
    }

    /**
     * Close session of user, remove it from session manager
     * 
     * @param userId id of user
     */
    public void close(long userId) {
        CLIENT_SESSION_MANAGER.removeSession(userId);
    }
}

After lots of GC cycles in remains in memory. Here is logs on GC (G1, mixed)

[GC pause (G1 Evacuation Pause) (mixed)
Desired survivor size 6815744 bytes, new threshold 4 (max 15)
- age   1:     236400 bytes,     236400 total
- age   2:     350240 bytes,     586640 total
- age   3:    3329024 bytes,    3915664 total
- age   4:    2926992 bytes,    6842656 total
, 0.0559520 secs]
   [Parallel Time: 51.9 ms, GC Workers: 2]
      [GC Worker Start (ms): Min: 73278041.7, Avg: 73278041.8, Max: 73278042.0, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 3.2, Avg: 3.4, Max: 3.7, Diff: 0.5, Sum: 6.9]
      [Update RS (ms): Min: 15.0, Avg: 15.0, Max: 15.0, Diff: 0.0, Sum: 30.1]
         [Processed Buffers: Min: 59, Avg: 65.0, Max: 71, Diff: 12, Sum: 130]
      [Scan RS (ms): Min: 13.6, Avg: 14.3, Max: 15.1, Diff: 1.5, Sum: 28.7]
      [Code Root Scanning (ms): Min: 0.4, Avg: 1.2, Max: 2.0, Diff: 1.5, Sum: 2.4]
      [Object Copy (ms): Min: 17.5, Avg: 17.7, Max: 17.8, Diff: 0.3, Sum: 35.4]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 2]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 51.6, Avg: 51.7, Max: 51.8, Diff: 0.2, Sum: 103.4]
      [GC Worker End (ms): Min: 73278093.6, Avg: 73278093.6, Max: 73278093.6, Diff: 0.0]
   [Code Root Fixup: 0.7 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 2.9 ms]
      [Choose CSet: 0.6 ms]
      [Ref Proc: 0.3 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.9 ms]
   [Eden: 95.0M(95.0M)->0.0B(1223.0M) Survivors: 7168.0K->5120.0K Heap: 512.0M(2048.0M)->363.5M(2048.0M)]
 [Times: user=0.11 sys=0.00, real=0.06 secs]
2016-09-28T10:40:00.815+0000: 73288.384: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 80740352 bytes, new threshold 15 (max 15)
- age   1:    1096960 bytes,    1096960 total
- age   2:     220208 bytes,    1317168 total
- age   3:     349352 bytes,    1666520 total
- age   4:    3325200 bytes,    4991720 tota

At pic below you can see path to root of problem session and you can see that only weak references chain exists.

enter image description here

Please help. Or at least give some suggestions.

Nicolas Filotto
  • 43,537
  • 11
  • 94
  • 122
Andre Minov
  • 325
  • 1
  • 3
  • 8
  • yes, this is legacy code and now changed. I posted initial version of it. But anyway I do not get objects by key - I select all keys always as you may see in code. – Andre Minov Sep 29 '16 at 08:15
  • 1
    There is no actual guarantee anywhere that GC will ever run at all, let alone do anything as specific as this. Code that relies on specific GC behaviour needs revision. Why is the reference chain so long? – user207421 Sep 29 '16 at 08:31
  • @EJP hashcode at the moment of insertion into map is 0 and all objects are put into same bucket - so it is just converted into linked list. Yes, code have to be changed ( but this is legacy one and problem have to be found) Yes, we rely on GC, but we expect nothing extraordinary except javadoc behavior of WeakReference. GC is run - I added logs of it. – Andre Minov Sep 29 '16 at 08:35
  • Please provide code to reproduce following the MCVE rule http://stackoverflow.com/help/mcve – Nicolas Filotto Sep 29 '16 at 08:35
  • @NicolasFilotto added, sorry. – Andre Minov Sep 29 '16 at 08:49
  • please also a `main`method that will do the steps to reproduce – Nicolas Filotto Sep 29 '16 at 09:06
  • @NicolasFilotto This is huge server app. Lets say we have rest API with login and logout. Login call Process.login and logout calls Process.close. Have I put all those stuff here? – Andre Minov Sep 29 '16 at 09:09
  • no unfortunately it is not enough, it could be enough for a very trivial question with an obvious answer but here it is obviously more complex. If you don't provide clear steps to reproduce, you could have no answer and you take the risk to see your question been closed. It is your job to provide clear steps to reproduce with as less code as possible – Nicolas Filotto Sep 29 '16 at 09:20
  • 2
    When you are repeatedly fetching all keys from the `WeakHashMap`, you are resurrecting the objects yourself. The garbage collection can only consider them unreachable if you are *not* iterating over them at the same time. – Holger Sep 29 '16 at 17:36
  • @Holger I'm sorry, I should have made more sense on my previous attempt, but not idea if I will now either. Are you talking about this part specifically? `for (Session sess : sessionsToCheck) { logger.warn("MemorySessionManager: Is still here: " + sess); }` that would create strong references in my understanding and thus prohibit GC to clear those – Eugene Oct 05 '18 at 14:30
  • @Eugene the preceding `sessionsToCheck.addAll(sessions.keySet())` makes all keys alive before iterating over the copy. Otherwise, the chances would be much smaller to have them all reachable when the garbage collector comes around. Just iterating over the map would make them alive only one at a time. But note that this will happen even when iterating over the values, as the underlying iterator would make each key alive temporarily, due to the fact that `next()` must be guaranteed to succeed after `hasNext()` has returned `true`. – Holger Oct 05 '18 at 14:53
  • @Holger I initially wanted to say : because of `sessions.keySet()` makes them all alive; but I thought that this is a *view* and somehow, unless you would start iterating, would not touch the keys at all. thank you! this being said I really don't know why you would not answer this question, I've read similar question here on SO, but none made me happy. I'm at a point where SO (unless of some users I truly respect), brings far less insights than garbage; as such good answers should be spread and linked and closed as duplicates - like this comment/answer. – Eugene Oct 05 '18 at 14:59
  • @Eugene I did not post it as an answer, because it’s only a possible scenario with some likelihood and there could have been be other issues as well. If the OP confirmed that this was the issue, I posted it as an answer. But the questioner never came back to confirm or rebut the hypothesis. – Holger Oct 08 '18 at 11:29

1 Answers1

0

As pointed in the comment by Holger, the following is wrong. I'm leaving it here temporarily so that others don't follow this wrong path.

Anyway, I'd recommend to get rid of the mutable hashCode and the whole MEMORY_SESSION_MANAGER (at the very least, change the userId before using the session as a key).

Wrong...

Mutable hashCode... that's bad, but acceptable as long it never changes after it gets used. Yet I can see

    Session session = new Session();
    MEMORY_SESSION_MANAGER.addSession(session);
    session.setUserId(userId);

You put it in a HashMap a a key and immediately change the hashCode. Now, the behavior is undefined, the map can throw or exercise an endless loop on any access.

If it was C/C++, it could eat your cat, too.

What really happens is that the key doesn't get found anymore. So it can't be removed, not by .remove(key) and probably also not in the cleanup. It can't be found so it can't be removed. That's it.. With an wrong hashCode a wrong slot gets searched.

As result after collection of all strong references on Session object it must be removed from WeakHashMap.

It must obey the contract only if you obey the contract. And you do not.

maaartinus
  • 44,714
  • 32
  • 161
  • 320
  • 1
    `expungeStaleEntries()` uses the hash code stored in the `Entry` instance, which is the hash code it had when it was added to the map. It can only work this way, as at the time, a `WeakReference` is enqueued, the object is unreachable, hence, can’t be queried for the hash code. So modifying hash codes after insertion still is a very bad idea, but not the cause of this specific problem. – Holger Sep 30 '16 at 08:56
  • @Holger **Damn. You're right. I'll think about it and delete my answer soon.** – maaartinus Sep 30 '16 at 10:57