Stack Overflow has been a way of life for me, but this time it's a question rather than a search for an answer, as I have probably exhausted all my options.

Apologies, as this will be a long description of the issue!

We have a Spring MVC application on Tomcat 7 running on Windows Server 2012 on AWS. It is an analytics application that invokes heavy-duty procedure calls doing statistical calculations in the backend.

With a high-availability requirement, I need to set up a cluster. With no multicasting on AWS, I resorted to two other options. (I must say this is my first foray into AWS and Tomcat in a production environment.)

1. Static Tomcat cluster with DeltaManager for session replication
2. Redis-based session replication (a long shot with a Windows server and sticky sessions)

I started with the static Tomcat cluster, which I set up without much fuss, and then went on to configure Apache httpd mod_proxy as the load balancer.

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster" channelSendOptions="8" channelStartOptions="3">
    <!-- channelStartOptions="3" added to disable multicast; channelSendOptions="8" is for async replication -->
    <Manager className="org.apache.catalina.ha.session.DeltaManager"
               expireSessionsOnShutdown="false"
               notifyListenersOnReplication="true"/>
    <Channel className="org.apache.catalina.tribes.group.GroupChannel">
        <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
               address="auto"
               port="4002"
               autoBind="9"
               selectorTimeout="5000"
               maxThreads="6"/>
        <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
            <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
        </Sender>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpPingInterceptor"/>
        <!-- Added. This interceptor pings other nodes so that all nodes can recognize when other nodes
        have left the cluster. Without this class, the cluster may appear to work fine, but session
        replication can break down when nodes are removed and re-introduced. -->
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
        <Interceptor className="org.apache.catalina.tribes.group.interceptors.StaticMembershipInterceptor">
        <Member className="org.apache.catalina.tribes.membership.StaticMember"  
              port="4000"  
              host="localhost"  
              domain="delta-static"  
              uniqueId="{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0}" />
        </Interceptor>  
    </Channel>

    <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter=""/>
    <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
    <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
    <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>

The mod_proxy configuration in httpd.conf with the AJP connector (relevant modules uncommented):

<Proxy balancer://IOCluster stickysession=JSESSIONID>
  BalancerMember ajp://127.0.0.1:8009  route=tcruntime8009 loadfactor=1  
  BalancerMember ajp://127.0.0.1:8012  route=tcruntime8012 loadfactor=1 
</Proxy>
ProxyPreserveHost On
ProxyStatus On

ProxyPass "/IO"  "balancer://IOCluster/IO"
ProxyPassReverse "/IO"  "balancer://IOCluster/IO"
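
For the route values above to take effect, each Tomcat instance presumably declares a matching jvmRoute on its Engine element in server.xml, since mod_proxy compares the route against the suffix Tomcat appends to the session ID. A minimal sketch for the first instance, assuming the default Engine name and host (neither is shown in the configuration here):

```xml
<!-- server.xml of the instance behind ajp://127.0.0.1:8009 (assumed layout) -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="tcruntime8009">
    <!-- Cluster, Realm and Host elements go here -->
</Engine>
```

The second instance would use jvmRoute="tcruntime8012" in the same place.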

The mod_proxy configuration in httpd.conf with the HTTP connector (relevant modules uncommented):

ProxyRequests Off
ProxyPass /IO balancer://IOCluster stickysession=JSESSIONID
ProxyPassReverse /IO balancer://IOCluster
BalancerMember http://localhost:8092/IO route=tcruntime8092
BalancerMember http://localhost:8091/IO route=tcruntime8091

The load balancer worked in both cases. The issue was session replication, which wasn't working, and I could see no sign of it in the logs. If I shut down one instance, the balancer would redirect to the other node, but I would see the login page, which proved that the session had not been replicated.

As per question 18835014, I added the `<distributable/>` tag to the application's web.xml and moved the DeltaManager tag to context.xml:

<Context>

   <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>
    <!-- Default set of monitored resources -->


    <WatchedResource>WEB-INF/web.xml</WatchedResource>
    <!--<Context distributable="true"></Context>-->
    <!-- Uncomment this to disable session persistence across Tomcat restarts -->
    <!--
    <Manager pathname="" />
    -->

    <!-- Uncomment this to enable Comet connection tacking (provides events
         on session expiration as well as webapp lifecycle) -->
    <!--
    <Valve className="org.apache.catalina.valves.CometConnectionManagerValve" />
    -->

</Context>
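
The web.xml side of this change is assumed to be the standard Servlet `<distributable/>` element, which marks the web application as eligible for session replication. A minimal sketch of where it sits (the Servlet 3.0 header is an assumption based on Tomcat 7, and the rest of the descriptor is omitted):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
                             http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd"
         version="3.0">
    <!-- Tells the container that sessions may be replicated across nodes -->
    <distributable/>
    <!-- existing servlet, filter and listener declarations follow -->
</web-app>
```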

After that, I could see session replication active on the console.

The issue now is that when I log into the application, the page becomes unresponsive even though the queries are fired. I can see 504 (Gateway Timeout) messages in the access logs, where all the GET requests otherwise return successfully. As soon as the login page is submitted, the first database queries fire, but the application becomes unresponsive.

If I move the DeltaManager back inside server.xml, the application becomes responsive again, but without session replication.

I also tried some other tweaks in httpd.conf (the prefork module, keepalive, timeout, etc.), after which I see 500 in the access log on the Apache server. Nothing worked. I would really appreciate any help!

<IfModule mpm_prefork_module>
  StartServers           10
  MinSpareServers        10
  MaxSpareServers        20
  MaxClients             50
  ServerLimit            50
  MaxRequestsPerChild  500
</IfModule>
ProxyRequests On 
ProxyTimeout 600
<Proxy *>
  AddDefaultCharset Off
  Order deny,allow
  Allow from all
</Proxy>
<Proxy balancer://IOCluster stickysession=JSESSIONID>
  BalancerMember ajp://127.0.0.1:8009 min=10 max=100 route=tcruntime8009 loadfactor=1 keepalive=On timeout=600 
  BalancerMember ajp://127.0.0.1:8012 min=10 max=100 route=tcruntime8012 loadfactor=1 keepalive=On timeout=600
</Proxy>
ProxyPreserveHost On
ProxyStatus On
ProxyPass "/IO"  "balancer://IOCluster/IO"
ProxyPassReverse "/IO"  "balancer://IOCluster/IO"
Jatin
  • Does the `<distributable/>` tag exist in `web.xml`? – Federico Sierra Jul 26 '16 at 13:38
  • Sorry for the delay, Federico! Yes, it does, but after a lot of messing around I made an accidental discovery. I configured memcache for session replication, and when I put the distributable tag in web.xml there is a socket timeout at the database layer; if I remove it, everything is back to normal (similar to the static-member session replication issue described above). The cluster is set up on a single Windows system. – Jatin Aug 02 '16 at 12:05
