
I use EHCache + JGroups to replicate the cache of my webapps across 3 Tomcat instances.

<!-- Use jgroups (UDP) to replicate cache among the cluster -->
    <cacheManagerPeerProviderFactory
        class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
        properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
        propertySeparator="::" />

Sometimes a Tomcat instance fails to restart. In the JGroups logs I see:

[webapp] WARN  2012-12-14 15:36:55,784 [GMS] : join(tc-fr-sta-tomcat1-32427) sent to b0dc40aa-12aa-4045-01e4-c80b013dbb13 timed out (after 5000 ms), retrying
[webapp] WARN  2012-12-14 15:36:55,785 [UDP] : tc-fr-sta-tomcat1-32427: no physical address for b0dc40aa-12aa-4045-01e4-c80b013dbb13, dropping message

It looks as if the node is trying to join itself. We have to restart every Tomcat instance in production to restore the cluster. Can anybody help me resolve this issue?

juliusdev

1 Answer


Which version of JGroups are you running (java -jar jgroups.jar prints it)? I recommend running the latest stable version. Also, set timer_type="old" in UDP.
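As a rough sketch of the timer_type suggestion, the question's peer provider could carry the extra property inside the UDP(...) part of the connect string. This assumes the property is written as timer_type=old (no quotes, separated by ";" like the other UDP properties) and that the JGroups version in use recognizes timer_type; neither is confirmed in this thread.

```xml
<!-- Sketch only: the question's config with timer_type added to UDP.
     Assumption: the installed JGroups version accepts timer_type. -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;timer_type=old;):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
    propertySeparator="::" />
```

Everything else in the element is unchanged from the question, so the only difference to verify after a restart is the timer setting.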

In addition, it would be better if ehcache allowed the JGroups config to be defined in an XML file; perhaps the latest version does this? (I'm not an ehcache expert.) Cheers, Bela
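For illustration, the same protocol stack written as a standalone JGroups XML file might look roughly like the sketch below. The file layout is an assumption based on the JGroups 2.x style of one element per protocol, and how (or whether) ehcache can be pointed at such a file depends on the ehcache JGroups replication version, so that part is deliberately left out.

```xml
<!-- Sketch only: the stack from the question, one element per protocol.
     Assumption: JGroups 2.x style XML; all protocols use their defaults
     except the UDP multicast address and port taken from the question. -->
<config>
    <UDP mcast_addr="229.10.10.10" mcast_port="45567" />
    <PING />
    <MERGE2 />
    <FD_SOCK />
    <VERIFY_SUSPECT />
    <pbcast.NAKACK />
    <UNICAST />
    <pbcast.STABLE />
    <FRAG />
    <pbcast.GMS />
</config>
```

A separate file like this keeps per-protocol attributes readable, which is harder in the single-line connect string.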

Bela Ban
  • Thanks, I'll try the timer_type="old" parameter in the UDP config. Here is my JGroups version: Version: 2.10.0.GA CVS: $Id: Version.java,v 1.101 2010/07/12 11:34:27 belaban Exp $ – juliusdev Dec 17 '12 at 09:00
  • This may help: http://stackoverflow.com/questions/20568661/face-clustering-in-tomcat-6-on-multiple-machine – HybrisHelp Dec 16 '13 at 07:59