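The hardest part of implementing a web application cluster is keeping data consistent across the cluster's nodes, and session data is the most important piece of that data. There are broadly two approaches. The first stores all session data on a single server or in a database, and every node in the cluster fetches session data from that session server. The second replicates session data among all nodes, so that every node holds a copy of every session. Each has its strengths: the first is simple and easy to implement, but a failure of the session server can bring down the whole system; the second is more reliable, since the failure of any one node does not affect the system's ability to respond to clients, but it is more complex to implement. Common platforms and middleware such as Microsoft ASP.NET and IBM WAS support both approaches. Tomcat does too, although the second is the one generally used.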

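Cluster notes:

1. Load balancing: requests from the same client are always forwarded by Apache to the same node (sticky session). When another client, or the same client from a new browser window, sends a request, Apache dispatches it to another node, rotating through the nodes in turn. Dispatch weights for the back-end Tomcat servers can also be set on Apache. Together this achieves load balancing.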

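2. High availability: when one of the Tomcat servers crashes unexpectedly, Apache redispatches the requests in progress to another Tomcat server in the cluster. Because the cluster members have already replicated sessions among themselves, the original session continues on the other node. The request fails over seamlessly, and the client does not notice that a failure has occurred.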

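* Tomcat performs in-memory session replication through the SimpleTcpCluster class. A Tomcat cluster determines group membership through multicast heartbeat packets, and uses TCP for data transfer and all other communication. Each node sends multicast heartbeats at a regular interval (500 ms by default) at startup and while running, and nodes in the same cluster listen for them on the same multicast address and port. A node that sends no heartbeat within dropTime (3 s by default) is considered dead and is removed from the cluster. Session replication requests and session updates travel between cluster members over direct TCP connections; that is, when a session is replicated, the node opens a TCP connection directly to the other nodes.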
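The membership rule described above (a node that stays silent for longer than dropTime is removed) can be sketched as a small model. This is only an illustration under stated assumptions, not Tomcat's implementation: the class `HeartbeatMembership` and its methods are hypothetical names, and timestamps are passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of heartbeat-based membership; not Tomcat's actual code. */
class HeartbeatMembership {
    private final long dropTimeMillis;
    // Member name -> timestamp (ms) of the last heartbeat received from it.
    private final Map<String, Long> lastSeen = new LinkedHashMap<>();

    HeartbeatMembership(long dropTimeMillis) {
        this.dropTimeMillis = dropTimeMillis;
    }

    /** Record a heartbeat from a member at the given time. */
    void heartbeat(String member, long nowMillis) {
        lastSeen.put(member, nowMillis);
    }

    /** Drop members silent for longer than dropTime and return the survivors. */
    List<String> aliveMembers(long nowMillis) {
        lastSeen.values().removeIf(seen -> nowMillis - seen > dropTimeMillis);
        return new ArrayList<>(lastSeen.keySet());
    }

    public static void main(String[] args) {
        HeartbeatMembership membership = new HeartbeatMembership(3000);
        membership.heartbeat("jvm_a", 0);
        membership.heartbeat("jvm_b", 0);
        membership.heartbeat("jvm_a", 2500); // jvm_a keeps sending, jvm_b goes silent
        System.out.println(membership.aliveMembers(3500)); // prints [jvm_a]
    }
}
```

In the real cluster the heartbeats arrive over multicast and the clock is the system clock; the sketch only captures the expiry rule.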

Configuration:

1. Apache configuration

Listen 8051

<VirtualHost *:8051>

ServerAdmin root@99bill.com

ServerName localhost

ServerAlias localhost

ProxyPass /myweb balancer://cluster/myweb stickysession=JSESSIONID|jsessionid lbmethod=byrequests timeout=5 maxattempts=3

ProxyPassReverse / balancer://cluster

ProxyRequests Off

ProxyPreserveHost On

ErrorLog "logs/tctest_error.log"

CustomLog "logs/tctest_access.log" common

<proxy balancer://cluster>

BalancerMember ajp://192.168.55.229:8009 route=jvm_a

BalancerMember ajp://192.168.55.231:8009 route=jvm_b

</proxy>

</VirtualHost>

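When Tomcat creates a session, it appends the route value to the session ID according to the jvmRoute setting (configured in the next step), for example 167A7621C8ACEF496A0E3D7720F7C35E.jvm1. When a client request belongs to an established session and therefore carries a route value, Apache applies the sticky session and keeps dispatching it to the Tomcat server that handled the previous request. A first request is dispatched according to the configured balancing rule.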
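The routing decision just described can be sketched in a few lines of Java. This is a simplified illustration, not mod_proxy_balancer's actual logic: the class `StickyRouter` and its methods are hypothetical, and the fallback is a plain round-robin standing in for lbmethod=byrequests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sticky-session routing; not mod_proxy_balancer's actual code. */
class StickyRouter {
    // Route name -> back-end address, mirroring the BalancerMember lines.
    private final Map<String, String> members = new LinkedHashMap<>();
    private int next = 0; // round-robin cursor for requests without a usable route

    void addMember(String route, String address) {
        members.put(route, address);
    }

    /** Returns the part after the last '.', or null if the ID has no route suffix. */
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.lastIndexOf('.');
        boolean noSuffix = dot < 0 || dot == sessionId.length() - 1;
        return noSuffix ? null : sessionId.substring(dot + 1);
    }

    /** Pick the member named by the route, otherwise fall back to round-robin. */
    String select(String sessionId) {
        if (members.isEmpty()) {
            throw new IllegalStateException("no balancer members configured");
        }
        String route = routeOf(sessionId);
        if (route != null && members.containsKey(route)) {
            return members.get(route);
        }
        String[] addresses = members.values().toArray(new String[0]);
        return addresses[next++ % addresses.length];
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter();
        router.addMember("jvm_a", "ajp://192.168.55.229:8009");
        router.addMember("jvm_b", "ajp://192.168.55.231:8009");
        // An established session sticks to the node named in its suffix.
        System.out.println(router.select("167A7621C8ACEF496A0E3D7720F7C35E.jvm_b"));
        // A first request carries no session ID and is balanced.
        System.out.println(router.select(null));
    }
}
```

A session ID whose route names a member that is not configured is treated here like a first request; what Apache does with a route to a failed member depends on its own failover settings.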

2. Tomcat configuration

1 Modify server.xml

On the two nodes respectively, change the Engine element to:

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm_a">

<Engine name="Catalina" defaultHost="localhost" jvmRoute="jvm_b">

2 Add the following between <Engine> and </Engine> in server.xml

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"

channelSendOptions="8">

<Manager className="org.apache.catalina.ha.session.DeltaManager"

expireSessionsOnShutdown="false"

notifyListenersOnReplication="true"/>

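The Manager copies sessions between nodes. DeltaManager is the default; it works all-to-all, meaning that each node copies its session data to every other node in the cluster, whether or not that node has the current application deployed. When the cluster has many nodes running different applications, BackupManager can be used instead; BackupManager copies sessions only to nodes on which the current application is deployed. As of this writing, however, BackupManager has not been tested at large scale and is less proven than DeltaManager.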

<Channel className="org.apache.catalina.tribes.group.GroupChannel">

<Membership className="org.apache.catalina.tribes.membership.McastService"

address="228.0.0.4"

port="45564"

frequency="500"

dropTime="3000"/>

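Membership is used to discover the other nodes in the cluster. The address here is a multicast address. Nodes that use the same multicast address and port belong to the same sub-cluster, so a large Tomcat cluster can be split into several sub-clusters by giving each its own multicast address and port.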

<Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"

address="auto"

port="4000"

autoBind="100"

selectorTimeout="5000"

maxThreads="6"/>

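The Receiver is how each node receives data sent by the other nodes. With the default configuration, Tomcat tries the ports from 4000 to 4100 in turn and uses the first one available. When configuring it yourself, take care to use different ports if several Tomcat nodes run on the same physical server.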
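The port probing described above can be sketched as follows. This is a minimal illustration assuming that autoBind is the number of consecutive ports to try starting at port; the class `AutoBindSketch` and its method are hypothetical and are not part of Tomcat.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Illustrative port probing in the spirit of port="4000" autoBind="100". */
class AutoBindSketch {
    /** Try port, port + 1, ... for up to autoBind attempts and return the bound socket. */
    static ServerSocket bind(int port, int autoBind) throws IOException {
        IOException last = null;
        for (int candidate = port; candidate < port + autoBind; candidate++) {
            try {
                return new ServerSocket(candidate);
            } catch (IOException busy) {
                last = busy; // port already in use, move on to the next one
            }
        }
        throw new IOException("no free port found starting at " + port, last);
    }

    public static void main(String[] args) throws IOException {
        // Two receivers on one machine end up on different ports.
        try (ServerSocket first = bind(4000, 100);
             ServerSocket second = bind(4000, 100)) {
            System.out.println(first.getLocalPort() + " and " + second.getLocalPort());
        }
    }
}
```

This matches what the startup log shows for two instances on the same host, where one receiver is bound to port 4000 and the other to 4001.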

<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>

</Sender>

The Sender sends data to the other nodes; the concrete implementation is configured through the Transport element.

<Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>

<Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>

</Channel>

A Channel is an abstract endpoint, similar to a socket, through which cluster members send and receive messages.

<Valve className="org.apache.catalina.ha.tcp.ReplicationValve"

filter=""/>

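A Valve inspects or acts on a request before the node responds to the client. ReplicationValve checks whether the current response involves a change to session data and, if so, triggers session replication. The filter attribute excludes requests from this check: client requests for images, CSS, and JavaScript files do not touch the session, so they need not be checked. By default no filtering is done and every response is checked.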

Use the following setting in production:

<Valve className="org.apache.catalina.ha.tcp.ReplicationValve"

filter=".*\.gif;.*\.js;.*\.jpg;.*\.htm;.*\.html;.*\.txt;"/>

That is, requests for static pages, images, and similar resources do not trigger session replication.
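To make the effect of the filter value concrete, here is a minimal sketch that treats it as a semicolon-separated list of regular expressions and tests request URIs against them. It is an illustration of the matching idea only, not ReplicationValve's source; the class `ReplicationFilterSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative matcher for a semicolon-separated filter of regular expressions. */
class ReplicationFilterSketch {
    private final List<Pattern> patterns = new ArrayList<>();

    ReplicationFilterSketch(String filter) {
        for (String part : filter.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed));
            }
        }
    }

    /** True if the whole URI matches one of the patterns, so replication is skipped. */
    boolean skipsReplication(String uri) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(uri).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ReplicationFilterSketch filter = new ReplicationFilterSketch(
                ".*\\.gif;.*\\.js;.*\\.jpg;.*\\.htm;.*\\.html;.*\\.txt;");
        System.out.println(filter.skipsReplication("/myweb/logo.gif"));  // prints true
        System.out.println(filter.skipsReplication("/myweb/index.jsp")); // prints false
    }
}
```

Because each pattern must match the whole URI, `.*\.js` does not accidentally exclude `index.jsp`, which still goes through the replication check.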

<Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>

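With this valve configured, when a node crashes and requests fail over to another node, the jvmRoute suffix of the session ID is rewritten to the new node's jvmRoute, so that the existing session is bound to the node now serving it.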

<ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>

<ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>

</Cluster>

3 Modify webapps/myweb/WEB-INF/web.xml

Add the <distributable/> tag.

That is, add just the following between <web-app> and </web-app>:

<distributable/>
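Once an application is marked distributable, the objects it stores in the session have to be copied to other nodes, which in practice means they must implement java.io.Serializable. The sketch below is a quick way to check that a value survives a serialization round trip; `SerializationCheck` and `CartItem` are hypothetical names used only for illustration and are not part of the test application.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Illustrative check that a session attribute can be serialized and restored. */
class SerializationCheck {
    /** Hypothetical session attribute; the fields are for illustration only. */
    static class CartItem implements Serializable {
        private static final long serialVersionUID = 1L;
        final String sku;
        final int quantity;

        CartItem(String sku, int quantity) {
            this.sku = sku;
            this.quantity = quantity;
        }
    }

    /** Serialize and deserialize a value, much as a copy sent to another node would be. */
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value); // throws NotSerializableException for other types
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CartItem copy = (CartItem) roundTrip(new CartItem("A-100", 2));
        System.out.println(copy.sku + " x " + copy.quantity); // prints A-100 x 2
    }
}
```

Running such a check in a unit test catches non-serializable attributes before they reach the cluster.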

4 Create a test application named myweb under webapps (for this experiment you can simply run cp -rf examples myweb)

Change index.jsp to the following:

Tomcat_a:

<%@ page contentType="text/html; charset=GBK" %>

<%@ page import="java.util.*" %>

<html><head><title>Cluster Test</title></head>

<body>

<%

//HttpSession session = request.getSession(true);

System.out.println(session.getCreationTime());

out.println("<br> SESSION ID:" + session.getId()+"<br>");

out.println("Session serviced by master"+"<br>");

out.println("Session created time is :"+session.getCreationTime()+"<br>");

%>

</body>

</html>

(When accessed, the page shows the session ID, the server name, and the session creation time.)


Tomcat_b:

<%@ page contentType="text/html; charset=GBK" %>

<%@ page import="java.util.*" %>

<html><head><title>Cluster Test</title></head>

<body>

<%

//HttpSession session = request.getSession(true);

System.out.println(session.getCreationTime());

out.println("<br> SESSION ID:" + session.getId()+"<br>");

out.println("Session serviced by node2"+"<br>");

out.println("Session created time is :"+session.getCreationTime()+"<br>");

%>

</body>

</html>

(When accessed, the page shows the session ID, the server name, and the session creation time.)

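Note: 1. If two servers are used, their clocks must be kept fully synchronized; use an NTP server to synchronize them.

2. Set the hostname and update the /etc/hosts file accordingly.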

Jul 7, 2011 12:53:38 PM org.apache.catalina.core.AprLifecycleListener init

INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: /usr/local/jdk1.6.0_05/jre/lib/amd64/server:/usr/local/jdk1.6.0_05/jre/lib/amd64:/usr/local/jdk1.6.0_05/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib

Jul 7, 2011 12:53:38 PM org.apache.tomcat.util.digester.Digester endElement

WARNING: No rules found matching 'Server/Service/Engine/Cluster/Channel/MemberShip'.

Jul 7, 2011 12:53:38 PM org.apache.tomcat.util.digester.SetPropertiesRule begin

WARNING: [SetPropertiesRule]{Server/Service/Engine/Cluster/Channel/Receiver} Setting property 'seceltorTimeout' to '5000' did not find a matching property.

Jul 7, 2011 12:53:38 PM org.apache.coyote.http11.Http11Protocol init

INFO: Initializing Coyote HTTP/1.1 on http-8080

Jul 7, 2011 12:53:38 PM org.apache.catalina.startup.Catalina load

INFO: Initialization processed in 778 ms

Jul 7, 2011 12:53:38 PM org.apache.catalina.core.StandardService start

INFO: Starting service Catalina

Jul 7, 2011 12:53:38 PM org.apache.catalina.core.StandardEngine start

INFO: Starting Servlet Engine: Apache Tomcat/6.0.30

Jul 7, 2011 12:53:38 PM org.apache.catalina.ha.tcp.SimpleTcpCluster start

INFO: Cluster is about to start

Jul 7, 2011 12:53:38 PM org.apache.catalina.tribes.transport.ReceiverBase bind

INFO: Receiver Server Socket bound to:/192.168.55.231:4000

Jul 7, 2011 12:53:38 PM org.apache.catalina.tribes.membership.McastServiceImpl setupSocket

INFO: Setting cluster mcast soTimeout to 500

Jul 7, 2011 12:53:38 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:4

Jul 7, 2011 12:53:38 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 229}:4000,{192, 168, 55, 229},4000, alive=147616,id={-115 -53 23 90 -40 -79 74 -54 -90 115 -116 85 81 -106 51 73 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:53:38 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 231}:4001,{192, 168, 55, 231},4001, alive=8077,id={90 53 3 75 21 83 64 89 -74 -72 34 -92 -19 -97 93 12 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:53:39 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Done sleeping, membership established, start level:4

Jul 7, 2011 12:53:39 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 231}:4000,{192, 168, 55, 231},4000, alive=1008,id={-15 24 -37 103 96 125 77 20 -79 -51 38 52 38 101 -128 -108 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:53:39 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:8

Jul 7, 2011 12:53:39 PM org.apache.catalina.tribes.io.BufferPool getBufferPool

INFO: Created a buffer pool with max size:104857600 bytes of type:org.apache.catalina.tribes.io.BufferPool15Impl

Jul 7, 2011 12:53:40 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Done sleeping, membership established, start level:8

Jul 7, 2011 12:53:40 PM org.apache.catalina.startup.HostConfig deployDescriptor

INFO: Deploying configuration descriptor host-manager.xml

Jul 7, 2011 12:53:40 PM org.apache.catalina.startup.HostConfig deployDescriptor

INFO: Deploying configuration descriptor manager.xml

Jul 7, 2011 12:53:41 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory moni2

Jul 7, 2011 12:53:41 PM org.apache.catalina.loader.WebappClassLoader validateJarFile

INFO: validateJarFile(/usr/local/apache-tomcat-6.0.30/webapps/moni2/WEB-INF/lib/j2ee.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class

Jul 7, 2011 12:53:41 PM org.apache.catalina.loader.WebappClassLoader validateJarFile

INFO: validateJarFile(/usr/local/apache-tomcat-6.0.30/webapps/moni2/WEB-INF/lib/servlet-api-2.4.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class

Jul 7, 2011 12:53:41 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Register manager /moni2 to cluster element Engine with name Catalina

Jul 7, 2011 12:53:41 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Starting clustering manager at /moni2

Jul 7, 2011 12:53:41 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions

WARNING: Manager [localhost#/moni2], requesting session state from org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 229}:4000,{192, 168, 55, 229},4000, alive=150126,id={-115 -53 23 90 -40 -79 74 -54 -90 115 -116 85 81 -106 51 73 }, payload={}, command={}, domain={}, ]. This operation will timeout if no session state has been received within 60 seconds.

Jul 7, 2011 12:53:41 PM org.apache.catalina.ha.session.DeltaManager waitForSendAllSessions

INFO: Manager [localhost#/moni2]; session state send at 7/7/11 12:53 PM received in 113 ms.0.30

Jul 7, 2011 12:55:24 PM org.apache.catalina.ha.tcp.SimpleTcpCluster start

INFO: Cluster is about to start

Jul 7, 2011 12:55:24 PM org.apache.catalina.tribes.transport.ReceiverBase bind

INFO: Receiver Server Socket bound to:/192.168.55.231:4000

Jul 7, 2011 12:55:24 PM org.apache.catalina.tribes.membership.McastServiceImpl setupSocket

INFO: Setting cluster mcast soTimeout to 500

Jul 7, 2011 12:55:24 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:4

Jul 7, 2011 12:55:24 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 231}:4001,{192, 168, 55, 231},4001, alive=114038,id={90 53 3 75 21 83 64 89 -74 -72 34 -92 -19 -97 93 12 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:55:25 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 229}:4000,{192, 168, 55, 229},4000, alive=254053,id={-115 -53 23 90 -40 -79 74 -54 -90 115 -116 85 81 -106 51 73 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:55:25 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Done sleeping, membership established, start level:4

Jul 7, 2011 12:55:25 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Sleeping for 1000 milliseconds to establish cluster membership, start level:8

Jul 7, 2011 12:55:25 PM org.apache.catalina.ha.tcp.SimpleTcpCluster memberAdded

INFO: Replication member added:org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 231}:4000,{192, 168, 55, 231},4000, alive=1007,id={-105 -85 -108 -38 -90 97 71 126 -124 -104 86 -113 42 -65 -116 85 }, payload={}, command={}, domain={}, ]

Jul 7, 2011 12:55:25 PM org.apache.catalina.tribes.io.BufferPool getBufferPool

INFO: Created a buffer pool with max size:104857600 bytes of type:org.apache.catalina.tribes.io.BufferPool15Impl

Jul 7, 2011 12:55:26 PM org.apache.catalina.tribes.membership.McastServiceImpl waitForMembers

INFO: Done sleeping, membership established, start level:8

Jul 7, 2011 12:55:26 PM org.apache.catalina.startup.HostConfig deployDescriptor

INFO: Deploying configuration descriptor host-manager.xml

Jul 7, 2011 12:55:26 PM org.apache.catalina.startup.HostConfig deployDescriptor

INFO: Deploying configuration descriptor manager.xml

Jul 7, 2011 12:55:27 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory moni2

Jul 7, 2011 12:55:27 PM org.apache.catalina.loader.WebappClassLoader validateJarFile

INFO: validateJarFile(/usr/local/apache-tomcat-6.0.30/webapps/moni2/WEB-INF/lib/j2ee.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class

Jul 7, 2011 12:55:27 PM org.apache.catalina.loader.WebappClassLoader validateJarFile

INFO: validateJarFile(/usr/local/apache-tomcat-6.0.30/webapps/moni2/WEB-INF/lib/servlet-api-2.4.jar) - jar not loaded. See Servlet Spec 2.3, section 9.7.2. Offending class: javax/servlet/Servlet.class

Jul 7, 2011 12:55:27 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Register manager /moni2 to cluster element Engine with name Catalina

Jul 7, 2011 12:55:27 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Starting clustering manager at /moni2

Jul 7, 2011 12:55:27 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions

WARNING: Manager [localhost#/moni2], requesting session state from org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 229}:4000,{192, 168, 55, 229},4000, alive=256061,id={-115 -53 23 90 -40 -79 74 -54 -90 115 -116 85 81 -106 51 73 }, payload={}, command={}, domain={}, ]. This operation will timeout if no session state has been received within 60 seconds.

Jul 7, 2011 12:55:27 PM org.apache.catalina.ha.session.DeltaManager waitForSendAllSessions

INFO: Manager [localhost#/moni2]; session state send at 7/7/11 12:55 PM received in 113 ms.

log4j:WARN No appenders could be found for logger (org.springframework.web.context.ContextLoader).

log4j:WARN Please initialize the log4j system properly.

Jul 7, 2011 12:55:28 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory examples

Jul 7, 2011 12:55:28 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory docs

Jul 7, 2011 12:55:28 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory myweb

Jul 7, 2011 12:55:28 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Register manager /myweb to cluster element Engine with name Catalina

Jul 7, 2011 12:55:28 PM org.apache.catalina.ha.session.DeltaManager start

INFO: Starting clustering manager at /myweb

Jul 7, 2011 12:55:28 PM org.apache.catalina.ha.session.DeltaManager getAllClusterSessions

WARNING: Manager [localhost#/myweb], requesting session state from org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 55, 229}:4000,{192, 168, 55, 229},4000, alive=257568,id={-115 -53 23 90 -40 -79 74 -54 -90 115 -116 85 81 -106 51 73 }, payload={}, command={}, domain={}, ]. This operation will timeout if no session state has been received within 60 seconds.

Jul 7, 2011 12:55:28 PM org.apache.catalina.ha.session.DeltaManager waitForSendAllSessions

INFO: Manager [localhost#/myweb]; session state send at 7/7/11 12:55 PM received in 104 ms.

Jul 7, 2011 12:55:28 PM org.apache.catalina.startup.HostConfig deployDirectory

INFO: Deploying web application directory ROOT

Jul 7, 2011 12:55:28 PM org.apache.catalina.ha.session.JvmRouteBinderValve start

INFO: JvmRouteBinderValve started

Jul 7, 2011 12:55:28 PM org.apache.coyote.http11.Http11Protocol start

INFO: Starting Coyote HTTP/1.1 on http-8080

Jul 7, 2011 12:55:28 PM org.apache.jk.common.ChannelSocket init

INFO: JK: ajp13 listening on /0.0.0.0:8009

Jul 7, 2011 12:55:28 PM org.apache.jk.server.JkMain start

INFO: Jk running ID=0 time=0/20 config=null

Jul 7, 2011 12:55:28 PM org.apache.catalina.startup.Catalina start

INFO: Server startup in 4309 ms

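The log shows that node-2 has been added to the cluster membership. The two WARNING lines at the start of the startup log come from misspelled names in server.xml: the element must be spelled Membership (not MemberShip) and the Receiver attribute selectorTimeout (not seceltorTimeout). Tomcat ignores names it does not recognize.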

Testing

1. Enter the following in the browser:

This shows that the request was served by the tomcat_a server.

cat /usr/local/apache-tomcat-6.0.30/logs/localhost.2011-07-07.log

The following entry appears:

Jul 7, 2011 12:55:30 PM org.apache.catalina.core.ApplicationContext log

INFO: SessionListener: sessionDestroyed('1260F261596C49A76141FA3949F0E02B.jvm_a')

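This indicates that session replication succeeded.

2. Simulating a failure

Stop the tomcat_a machine, then keep refreshing the browser. The page consistently shows:

This shows that the session ID and the session creation time are unchanged, while the jvmRoute and the Tomcat server have changed. The experiment succeeded.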