For the basic concepts and operation of Red Hat Cluster Suite, see:
Cluster Suite Overview
Cluster Administration
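The following is a simple example of deploying a Tomcat failover cluster.
Environment:
2 Dell 1850 servers
Node 1: Perf-LG-6.s3lab.mot.com 192.168.16.16
Node 2: Perf-LG-7.s3lab.mot.com 192.168.16.17
The plan is to deploy a 2-node cluster that uses 192.168.16.31 as the floating IP.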
1. Install the cluster software
yum -y groupinstall "Clustering"
[root@Perf-LG-6.s3lab.mot.com ~]# yum -y groupinstall "Clustering"
Loading "fastestmirror" plugin
Loading mirror speeds from cached hostfile
 * EPEL-base: 192.168.11.16
 * base: 192.168.11.16
 * update: 192.168.11.16
base                      100% |=========================| 1.1 kB    00:00
update                    100% |=========================|  951 B    00:00
primary.xml.gz            100% |=========================| 434 kB    00:00
update    : ################################################## 959/959
Setting up Group Process
Loading mirror speeds from cached hostfile
 * EPEL-base: 192.168.11.16
 * base: 192.168.11.16
 * update: 192.168.11.16
Resolving Dependencies
--> Running transaction check
---> Package piranha.x86_64 0:0.8.4-9.3.el5 set to be updated
---> Package modcluster.x86_64 0:0.12.0-7.el5.centos set to be updated
--> Processing Dependency: libcman.so.2()(64bit) for package: modcluster
---> Package system-config-cluster.noarch 0:1.0.52-1.1 set to be updated
---> Package cluster-cim.x86_64 0:0.12.0-7.el5.centos set to be updated
--> Processing Dependency: tog-pegasus for package: cluster-cim
---> Package luci.x86_64 0:0.12.0-7.el5.centos.3 set to be updated
--> Processing Dependency: python-imaging for package: luci
---> Package rgmanager.x86_64 0:2.0.38-2.el5_2.1 set to be updated
---> Package ricci.x86_64 0:0.12.0-7.el5.centos.3 set to be updated
---> Package ipvsadm.x86_64 0:1.24-8.1 set to be updated
---> Package cluster-snmp.x86_64 0:0.12.0-7.el5.centos set to be updated
--> Processing Dependency: net-snmp for package: cluster-snmp
--> Running transaction check
---> Package cman.x86_64 0:2.0.84-2.el5_2.3 set to be updated
--> Processing Dependency: libSaCkpt.so.2(OPENAIS_CKPT_B.01.01)(64bit) for package: cman
--> Processing Dependency: perl(XML::LibXML) for package: cman
--> Processing Dependency: openais for package: cman
--> Processing Dependency: libSaCkpt.so.2()(64bit) for package: cman
--> Processing Dependency: libcpg.so.2()(64bit) for package: cman
--> Processing Dependency: libcpg.so.2(OPENAIS_CPG_1.0)(64bit) for package: cman
--> Processing Dependency: perl(Net::Telnet) for package: cman
---> Package net-snmp.x86_64 1:5.3.1-24.el5_2.2 set to be updated
--> Processing Dependency: libsensors.so.3()(64bit) for package: net-snmp
---> Package tog-pegasus.x86_64 2:2.7.0-2.el5_2.1 set to be updated
---> Package python-imaging.x86_64 0:1.1.5-5.el5 set to be updated
--> Processing Dependency: tkinter for package: python-imaging
--> Processing Dependency: libtk8.4.so()(64bit) for package: python-imaging
--> Running transaction check
---> Package lm_sensors.x86_64 0:2.10.0-3.1 set to be updated
---> Package perl-XML-LibXML.x86_64 0:1.58-5 set to be updated
--> Processing Dependency: perl-XML-NamespaceSupport for package: perl-XML-LibXML
--> Processing Dependency: perl-XML-LibXML-Common for package: perl-XML-LibXML
--> Processing Dependency: perl(XML::SAX::Exception) for package: perl-XML-LibXML
--> Processing Dependency: perl(XML::LibXML::Common) for package: perl-XML-LibXML
--> Processing Dependency: perl-XML-SAX for package: perl-XML-LibXML
--> Processing Dependency: perl(XML::SAX::DocumentLocator) for package: perl-XML-LibXML
--> Processing Dependency: perl(XML::SAX::Base) for package: perl-XML-LibXML
--> Processing Dependency: perl(XML::NamespaceSupport) for package: perl-XML-LibXML
---> Package openais.x86_64 0:0.80.3-15.el5 set to be updated
---> Package tkinter.x86_64 0:2.4.3-21.el5 set to be updated
--> Processing Dependency: libTix8.4.so()(64bit) for package: tkinter
---> Package perl-Net-Telnet.noarch 0:3.03-7.el5 set to be updated
---> Package tk.x86_64 0:8.4.13-5.el5_1.1 set to be updated
--> Running transaction check
---> Package perl-XML-NamespaceSupport.noarch 0:1.09-1.2.1 set to be updated
---> Package perl-XML-LibXML-Common.x86_64 0:0.13-8.2.2 set to be updated
---> Package perl-XML-SAX.noarch 0:0.14-5 set to be updated
---> Package tix.x86_64 1:8.4.0-11.fc6 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=============================================================================
 Package                    Arch     Version                 Repository  Size
=============================================================================
Installing:
 cluster-cim                x86_64   0.12.0-7.el5.centos     base       142 k
 cluster-snmp               x86_64   0.12.0-7.el5.centos     base       139 k
 luci                       x86_64   0.12.0-7.el5.centos.3   base        27 M
 piranha                    x86_64   0.8.4-9.3.el5           base       714 k
 rgmanager                  x86_64   2.0.38-2.el5_2.1        update     294 k
 ricci                      x86_64   0.12.0-7.el5.centos.3   base       1.1 M
 system-config-cluster      noarch   1.0.52-1.1              base       290 k
Installing for dependencies:
 cman                       x86_64   2.0.84-2.el5_2.3        update     649 k
 ipvsadm                    x86_64   1.24-8.1                base        31 k
 lm_sensors                 x86_64   2.10.0-3.1              base       504 k
 modcluster                 x86_64   0.12.0-7.el5.centos     base       331 k
 net-snmp                   x86_64   1:5.3.1-24.el5_2.2      update     702 k
 openais                    x86_64   0.80.3-15.el5           base       374 k
 perl-Net-Telnet            noarch   3.03-7.el5              EPEL-base   56 k
 perl-XML-LibXML            x86_64   1.58-5                  base       230 k
 perl-XML-LibXML-Common     x86_64   0.13-8.2.2              base        16 k
 perl-XML-NamespaceSupport  noarch   1.09-1.2.1              base        15 k
 perl-XML-SAX               noarch   0.14-5                  base        75 k
 python-imaging             x86_64   1.1.5-5.el5             base       408 k
 tix                        x86_64   1:8.4.0-11.fc6          base       333 k
 tk                         x86_64   8.4.13-5.el5_1.1        base       901 k
 tkinter                    x86_64   2.4.3-21.el5            base       281 k
 tog-pegasus                x86_64   2:2.7.0-2.el5_2.1       update     6.8 M

Transaction Summary
=============================================================================
Install     23 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 41 M
Downloading Packages:
(1/23): ricci-0.12.0-7.el 100% |=========================| 1.1 MB    00:00
(2/23): cman-2.0.84-2.el5 100% |=========================| 649 kB    00:00
(3/23): lm_sensors-2.10.0 100% |=========================| 504 kB    00:00
(4/23): tix-8.4.0-11.fc6. 100% |=========================| 333 kB    00:00
(5/23): piranha-0.8.4-9.3 100% |=========================| 714 kB    00:00
(6/23): python-imaging-1. 100% |=========================| 408 kB    00:00
(7/23): cluster-cim-0.12. 100% |=========================| 142 kB    00:00
(8/23): perl-XML-SAX-0.14 100% |=========================|  75 kB    00:00
(9/23): tkinter-2.4.3-21. 100% |=========================| 281 kB    00:00
(10/23): net-snmp-5.3.1-2 100% |=========================| 702 kB    00:00
(11/23): tk-8.4.13-5.el5_ 100% |=========================| 901 kB    00:00
(12/23): modcluster-0.12. 100% |=========================| 331 kB    00:00
(13/23): system-config-cl 100% |=========================| 290 kB    00:00
(14/23): perl-XML-LibXML- 100% |=========================| 230 kB    00:00
(15/23): rgmanager-2.0.38 100% |=========================| 294 kB    00:00
(16/23): luci-0.12.0-7.el 100% |=========================|  27 MB    00:02
(17/23): cluster-snmp-0.1 100% |=========================| 139 kB    00:00
(18/23): perl-XML-LibXML- 100% |=========================|  16 kB    00:00
(19/23): perl-XML-Namespa 100% |=========================|  15 kB    00:00
(20/23): perl-Net-Telnet- 100% |=========================|  56 kB    00:00
(21/23): tog-pegasus-2.7. 100% |=========================| 6.8 MB    00:00
(22/23): ipvsadm-1.24-8.1 100% |=========================|  31 kB    00:00
(23/23): openais-0.80.3-1 100% |=========================| 374 kB    00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing: tk                           ####################### [ 1/23]
  Installing: tix                          ####################### [ 2/23]
  Installing: tkinter                      ####################### [ 3/23]
  Installing: python-imaging               ####################### [ 4/23]
  Installing: lm_sensors                   ####################### [ 5/23]
  Installing: net-snmp                     ####################### [ 6/23]
  Installing: tog-pegasus                  ####################### [ 7/23]
  Installing: perl-XML-LibXML-Common       ####################### [ 8/23]
  Installing: ipvsadm                      ####################### [ 9/23]
  Installing: openais                      ####################### [10/23]
  Installing: perl-XML-NamespaceSupport    ####################### [11/23]
  Installing: perl-XML-SAX                 ####################### [12/23]
  Installing: perl-XML-LibXML              ####################### [13/23]
could not find ParserDetails.ini in /usr/lib/perl5/vendor_perl/5.8.8/XML/SAX
  Installing: perl-Net-Telnet              ####################### [14/23]
  Installing: piranha                      ####################### [15/23]
  Installing: luci                         ####################### [16/23]
  Installing: cman                         ####################### [17/23]
  Installing: modcluster                   ####################### [18/23]
  Installing: cluster-snmp                 ####################### [19/23]
  Installing: rgmanager                    ####################### [20/23]
  Installing: system-config-cluster        ####################### [21/23]
  Installing: cluster-cim                  ####################### [22/23]
  Installing: ricci                        ####################### [23/23]

Installed: cluster-cim.x86_64 0:0.12.0-7.el5.centos cluster-snmp.x86_64 0:0.12.0-7.el5.centos luci.x86_64 0:0.12.0-7.el5.centos.3 piranha.x86_64 0:0.8.4-9.3.el5 rgmanager.x86_64 0:2.0.38-2.el5_2.1 ricci.x86_64 0:0.12.0-7.el5.centos.3 system-config-cluster.noarch 0:1.0.52-1.1
Dependency Installed: cman.x86_64 0:2.0.84-2.el5_2.3 ipvsadm.x86_64 0:1.24-8.1 lm_sensors.x86_64 0:2.10.0-3.1 modcluster.x86_64 0:0.12.0-7.el5.centos net-snmp.x86_64 1:5.3.1-24.el5_2.2 openais.x86_64 0:0.80.3-15.el5 perl-Net-Telnet.noarch 0:3.03-7.el5 perl-XML-LibXML.x86_64 0:1.58-5 perl-XML-LibXML-Common.x86_64 0:0.13-8.2.2 perl-XML-NamespaceSupport.noarch 0:1.09-1.2.1 perl-XML-SAX.noarch 0:0.14-5 python-imaging.x86_64 0:1.1.5-5.el5 tix.x86_64 1:8.4.0-11.fc6 tk.x86_64 0:8.4.13-5.el5_1.1 tkinter.x86_64 0:2.4.3-21.el5 tog-pegasus.x86_64 2:2.7.0-2.el5_2.1
Complete!
[root@Perf-LG-6.s3lab.mot.com ~]#
2. Install Tomcat 6
useradd tomcat
tar -C /home/tomcat -zxf /u01/software/blur/apache-tomcat-6.0.18.blur.tar.gz
chown -R tomcat:tomcat /home/tomcat/apache-tomcat-6.0.18.blur
mkdir -p /usr/java
tar -C /usr/java -xf /u01/software/blur/jdk1.6.0_11.tar
cat >/etc/profile.d/java.sh <<'EOF'
#
# Automatically generated file, check puppet master to make changes
#
export JAVA_HOME=/usr/java/jdk1.6.0_11
export PATH=$JAVA_HOME/bin:$PATH
export CATALINA_OPTS="-Xms512m -Xmx512m -XX:MaxPermSize=256M"
# perhaps pass the tc version as a param
export CATALINA_PID=/home/tomcat/apache-tomcat-6.0.18.blur/bin/tomcat.pid
export LD_LIBRARY_PATH=/usr/local/apr/lib
EOF
chmod +x /etc/profile.d/java.sh
3. Bind the Tomcat listener to the floating IP address
vi /home/tomcat/apache-tomcat-6.0.18.blur/conf/server.xml
<!-- A "Connector" represents an endpoint by which requests are received
     and responses are returned. Documentation at :
     Java HTTP Connector: /docs/config/http.html (blocking & non-blocking)
     Java AJP  Connector: /docs/config/ajp.html
     APR (HTTP/AJP) Connector: /docs/apr.html
     Define a non-SSL HTTP/1.1 Connector on port 8080
-->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           address="192.168.16.31" />
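To double-check the edit, the bound address can be read back from server.xml. The following is only a rough sketch: `connector_address` is a hypothetical helper, not part of Tomcat, and it relies on plain text matching rather than a real XML parser, so a commented-out `<Connector>` that also carries an `address` attribute would be reported as well.

```shell
# connector_address FILE PORT
# Prints the address="..." value of each <Connector> element in FILE that
# declares port="PORT". Prints nothing if the connector has no address.
# Hypothetical helper; text matching only, XML comments are not skipped.
connector_address() {
    # Join all lines, then start a new line at every "<" so that each tag
    # (even one written across several lines) ends up on a single line.
    tr '\n' ' ' < "$1" |
        tr '<' '\n' |
        grep '^Connector[[:space:]]' |
        grep "[[:space:]]port=\"$2\"" |
        sed -n 's/.*[[:space:]]address="\([^"]*\)".*/\1/p'
}

# Example call (path taken from the step above):
# connector_address /home/tomcat/apache-tomcat-6.0.18.blur/conf/server.xml 8080
```

If the call prints 192.168.16.31, the HTTP connector should only accept connections on the floating IP, which also means Tomcat can only start on the node that currently holds that address.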
4. Add a resource type for Tomcat
cat >/usr/share/cluster/tomcat6.sh <<'EOFEOFEOF'
#!/bin/sh
#
# Startup script for Tomcat Servlet Engine
#
# (Automatically generated file, check puppet master to make changes)
#
# chkconfig: 345 86 14
# description: Tomcat Servlet Engine
# processname: tomcat
# pidfile: /home/tomcat/apache-tomcat-6.0.18.blur/bin/tomcat.pid
#

export LC_ALL=C
export LANG=C
export PATH=/bin:/sbin:/usr/bin:/usr/sbin

#. $(dirname $0)/ocf-shellfuncs
#. $(dirname $0)/utils/config-utils.sh
#. $(dirname $0)/utils/messages.sh
#. $(dirname $0)/utils/ra-skelet.sh
. /etc/init.d/functions

# ocf-shellfuncs is not sourced, so define the generic OCF error code (1)
# here; otherwise "return $OCF_ERR_GENERIC" degrades to a bare "return".
OCF_ERR_GENERIC=${OCF_ERR_GENERIC:-1}

# User under which tomcat will run
TOMCAT_USER=tomcat
TOMCAT_HOMELOC=/home/tomcat/apache-tomcat-6.0.18.blur
RETVAL=0
prog=tomcat6.0.18.blur

# start, debug, stop, and status functions
meta_data()
{
    cat <<'EOF'
<?xml version="1.0"?>
<resource-agent version="rgmanager 2.0" name="tomcat6">
    <version>1.0</version>

    <longdesc lang="en">
        This defines an instance of Tomcat 6 server
    </longdesc>
    <shortdesc lang="en">
        Defines a Tomcat 6 server
    </shortdesc>

    <parameters>
        <parameter name="name" primary="1">
            <longdesc lang="en">
                Specifies a service name for logging and other purposes
            </longdesc>
            <shortdesc lang="en">
                Name
            </shortdesc>
            <content type="string"/>
        </parameter>

        <parameter name="service_name" inherit="service%name">
            <longdesc lang="en">
                Inherit the service name. We need to know
                the service name in order to determine file
                systems and IPs for this service.
            </longdesc>
            <shortdesc lang="en">
                Inherit the service name.
            </shortdesc>
            <content type="string"/>
        </parameter>
    </parameters>

    <actions>
        <action name="start" timeout="60"/>
        <action name="stop" timeout="60"/>

        <!-- Checks to see if it's mounted in the right place -->
        <action name="status" interval="10" timeout="10"/>
        <action name="monitor" interval="10" timeout="10"/>

        <!--
        <action name="status" depth="*" timeout="120" interval="5m"/>
        <action name="monitor" depth="*" timeout="120" interval="5m"/>
        -->

        <action name="meta-data" timeout="10"/>
        <action name="verify-all" timeout="10"/>
    </actions>

    <special tag="rgmanager">
    </special>
</resource-agent>
EOF
}

verify_all()
{
    return 0
}

start() {
    # Start Tomcat in normal mode
    SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
    if [ $SHUTDOWN_PORT -ne 0 ]; then
        echo -n "Tomcat already started"
        echo_success
        echo
    else
        echo "Starting tomcat..."
        chown -R $TOMCAT_USER:$TOMCAT_USER $TOMCAT_HOMELOC/*
        su -l $TOMCAT_USER -c "$TOMCAT_HOMELOC/bin/startup.sh"
        # Record the result of startup.sh itself, not of the wait loop below
        RETVAL=$?
        # Wait until the shutdown port (8005) is listening
        SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
        while [ $SHUTDOWN_PORT -eq 0 ]; do
            sleep 1
            # echo -n "."
            SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
        done
        echo
        echo -n "Tomcat started in normal mode"
        echo_success
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/tomcat6
    fi
}

debug() {
    # Start Tomcat in debug mode
    SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
    if [ $SHUTDOWN_PORT -ne 0 ]; then
        echo -n "Tomcat already started"
        echo_success
        echo
    else
        echo "Starting tomcat in debug mode..."
        chown -R $TOMCAT_USER:$TOMCAT_USER $TOMCAT_HOMELOC/*
        su -l $TOMCAT_USER -c "$TOMCAT_HOMELOC/bin/catalina.sh jpda start"
        # Record the result of catalina.sh itself, not of the wait loop below
        RETVAL=$?
        SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
        while [ $SHUTDOWN_PORT -eq 0 ]; do
            sleep 1
            # echo -n "."
            SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
        done
        echo
        echo -n "Tomcat started in debug mode"
        echo_success
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/tomcat6
    fi
}

stop() {
    SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
    if [ $SHUTDOWN_PORT -eq 0 ]; then
        echo -n "Tomcat already stopped"
        echo_success
        echo
    else
        echo "Stopping tomcat..."
        su -l $TOMCAT_USER -c "$TOMCAT_HOMELOC/bin/shutdown.sh -force"
        RETVAL=$?
        sleep 5
        echo_success
        echo -n "Tomcat stopped"
        # tomcat smackdown
        #kill -9 `ps -ef |grep tomcat | grep -v grep | awk '{print $2}'` 2> /dev/null
        echo
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/tomcat6 $TOMCAT_HOMELOC/bin/tomcat.pid
    fi
}

status() {
    SHUTDOWN_PORT=`netstat -vatn|grep LISTEN|grep 8005|wc -l`
    if [ $SHUTDOWN_PORT -eq 0 ]; then
        echo -n "Tomcat stopped"
        echo_success
        echo
        # Non-zero status tells rgmanager that the resource has failed
        return $OCF_ERR_GENERIC
    else
        MODE="normal"
        JPDA_PORT=`netstat -vatn|grep LISTEN|grep 8000|wc -l`
        if [ $JPDA_PORT -ne 0 ]; then
            MODE="debug"
        fi
        echo "Tomcat running in $MODE mode"
    fi
}

case "$1" in
    start)
        verify_all && start
        exit $?
        ;;
    debug)
        debug
        ;;
    stop)
        verify_all && stop
        exit $?
        ;;
    restart)
        verify_all
        stop
        start
        exit $?
        ;;
    redebug)
        stop
        debug
        ;;
    status|monitor)
        verify_all
        status
        RETVAL=$?
        ;;
    meta-data)
        meta_data
        exit 0
        ;;
    verify-all)
        verify_all
        exit $?
        ;;
    *)
        echo "Usage: $0 {start|debug|stop|restart|redebug|status|meta-data}"
        exit $OCF_ERR_GENERIC
esac

exit $RETVAL
EOFEOFEOF
chmod +x /usr/share/cluster/tomcat6.sh
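The start and debug functions of the agent poll for the shutdown port in a loop with no upper bound, so a Tomcat that never comes up keeps the script waiting until rgmanager's own start timeout intervenes. A bounded wait could be sketched as follows. Both `port_listening` and `wait_for` are hypothetical helper names, not part of rgmanager or Tomcat, and the 60-second limit in the example simply mirrors the start timeout declared in the agent's metadata.

```shell
# port_listening PORT
# Succeeds if a TCP socket is in LISTEN state on exactly PORT.
# Matches the end of the local-address column, so 8005 does not match 18005.
port_listening() {
    netstat -tan 2>/dev/null |
        awk -v p=":$1" '$NF == "LISTEN" && $4 ~ (p "$") { found = 1 }
                        END { exit !found }'
}

# wait_for TIMEOUT CMD [ARGS...]
# Runs CMD once per second until it succeeds or TIMEOUT seconds have passed.
# Returns 0 as soon as CMD succeeds, 1 on timeout.
wait_for() {
    wf_timeout=$1
    shift
    wf_elapsed=0
    until "$@"; do
        [ "$wf_elapsed" -ge "$wf_timeout" ] && return 1
        sleep 1
        wf_elapsed=$((wf_elapsed + 1))
    done
    return 0
}

# Example: give Tomcat up to 60 seconds to open its shutdown port.
# wait_for 60 port_listening 8005 || echo "Tomcat did not start in time"
```

Passing the probe as a command keeps `wait_for` independent of netstat, so the same loop could wait on any other readiness check.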
5. Configure the cluster
cat >/etc/cluster/cluster.conf<<EOF
<?xml version="1.0" ?>
<cluster alias="new_cluster" config_version="17" name="new_cluster">
    <fence_daemon post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="192.168.16.16" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="testFence" nodename="192.168.16.16"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="192.168.16.17" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="testFence" nodename="192.168.16.17"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_manual" name="testFence"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="testFailoverDom" ordered="0" restricted="0">
                <failoverdomainnode name="192.168.16.16" priority="1"/>
                <failoverdomainnode name="192.168.16.17" priority="1"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <ip address="192.168.16.31" monitor_link="1"/>
            <tomcat6 name="tc6test"/>
        </resources>
        <service autostart="1" domain="testFailoverDom" name="tc6svc" recovery="relocate">
            <ip ref="192.168.16.31">
                <tomcat6 ref="tc6test"/>
            </ip>
        </service>
    </rm>
</cluster>
EOF
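The `config_version` attribute in cluster.conf (17 in this example) is expected to increase every time the file is edited, so that nodes can tell a newer configuration from an older one. A small helper for bumping it might look like the sketch below. `current_config_version` and `bump_config_version` are hypothetical names, and the sed expressions assume that `config_version` sits on the same line as the opening `<cluster` tag, as it does in the file above.

```shell
# current_config_version FILE
# Prints the config_version value found on the <cluster ...> line of FILE.
current_config_version() {
    sed -n 's/.*<cluster[^>]* config_version="\([0-9][0-9]*\)".*/\1/p' "$1" |
        head -n 1
}

# bump_config_version FILE
# Rewrites FILE with config_version increased by one.
# Returns 1 if no config_version attribute could be found.
bump_config_version() {
    cv_cur=$(current_config_version "$1")
    [ -n "$cv_cur" ] || return 1
    cv_new=$((cv_cur + 1))
    sed "s/\(<cluster[^>]* config_version=\"\)$cv_cur\"/\1$cv_new\"/" "$1" \
        > "$1.new" && mv "$1.new" "$1"
}

# Example (run on a copy first):
# cp /etc/cluster/cluster.conf /tmp/cluster.conf
# bump_config_version /tmp/cluster.conf
```

Whatever tool is used to edit the file, both nodes need the same cluster.conf before the cluster services are restarted.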
6. Start the cluster services
service rgmanager stop
service cman stop
service cman start
service rgmanager start

[root@Perf-LG-6.s3lab.mot.com ~]# service rgmanager stop
Cluster Service Manager is stopped.
[root@Perf-LG-6.s3lab.mot.com ~]# service cman stop
Stopping cluster:
   Stopping fencing... done
   Stopping cman... done
   Stopping ccsd... done
   Unmounting configfs... done
                                                           [  OK  ]
[root@Perf-LG-6.s3lab.mot.com ~]# service cman start
Starting cluster:
   Loading modules... done
   Mounting configfs... done
   Starting ccsd... done
   Starting cman... done
   Starting daemons... done
   Starting fencing... done
                                                           [  OK  ]
[root@Perf-LG-6.s3lab.mot.com ~]# service rgmanager start
Starting Cluster Service Manager:                          [  OK  ]

root     30617    35  0 03:03 ?        00:00:00 [gfs2_scand]
root     30619    35  0 03:03 ?        00:00:00 [glock_workqueue]
root     30620    35  0 03:03 ?        00:00:00 [glock_workqueue]
root     30637     1  0 03:03 ?        00:00:00 /sbin/ccsd
root     30643     1  0 03:03 ?        00:00:00 aisexec
root     30653     1  0 03:03 ?        00:00:00 /sbin/groupd
root     30661     1  0 03:03 ?        00:00:00 /sbin/fenced
root     30667     1  0 03:03 ?        00:00:00 /sbin/dlm_controld
root     30673     1  0 03:03 ?        00:00:00 /sbin/gfs_controld
root     30703     1  0 03:04 ?        00:00:00 clurgmgrd
root     30704 30703  0 03:04 ?        00:00:00 clurgmgrd
root     30705    35  0 03:04 ?        00:00:00 [dlm_astd]
root     30706    35  0 03:04 ?        00:00:00 [dlm_scand]
root     30707    35  0 03:04 ?        00:00:00 [dlm_recv]
root     30708    35  0 03:04 ?        00:00:00 [dlm_send]
root     30709    35  0 03:04 ?        00:00:00 [dlm_recoverd]
root     31125 28541  0 03:04 pts/7    00:00:00 ps -ef
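The order of the four service commands matters: rgmanager runs on top of cman, so it is stopped first and started last. The sequence can be wrapped in a function that stops at the first failing step. This is only a sketch; `restart_cluster_stack` is a hypothetical name, and the optional argument is a prefix command, so passing `echo` gives a dry run that prints the commands instead of executing them.

```shell
# restart_cluster_stack [PREFIX]
# Restarts the cluster services in dependency order and stops at the
# first command that fails. With PREFIX=echo nothing is executed; the
# commands are only printed.
restart_cluster_stack() {
    rcs_run=${1:-}
    $rcs_run service rgmanager stop &&
    $rcs_run service cman stop &&
    $rcs_run service cman start &&
    $rcs_run service rgmanager start
}

# Dry run:
# restart_cluster_stack echo
# Real run (as root, on each node):
# restart_cluster_stack
```

After a real run, the `clustat` command that ships with rgmanager can be used to see which node currently owns the tc6svc service.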
If the cman service refuses to stop, run one of the following commands and then try stopping it again:
cman_tool leave force
cman_tool leave force remove
Once the rgmanager service is up, tomcat6 is started automatically:
[root@Perf-LG-6.s3lab.mot.com ~]# ps -ef|grep java
tomcat    4542     1 97 04:50 ?        00:00:06 /usr/java/jdk1.6.0_11/bin/java -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=/home/tomcat/apache-tomcat-6.0.18.blur/conf/logging.properties -Xms512m -Xmx512m -XX:MaxPermSize=256M -Djava.endorsed.dirs=/home/tomcat/apache-tomcat-6.0.18.blur/endorsed -classpath :/home/tomcat/apache-tomcat-6.0.18.blur/bin/bootstrap.jar -Dcatalina.base=/home/tomcat/apache-tomcat-6.0.18.blur -Dcatalina.home=/home/tomcat/apache-tomcat-6.0.18.blur -Djava.io.tmpdir=/home/tomcat/apache-tomcat-6.0.18.blur/temp org.apache.catalina.startup.Bootstrap start
root      4679 28541  0 04:50 pts/7    00:00:00 grep java
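7. Test
If the Tomcat process on node 1 is killed, the service is restarted on node 2 after roughly 10 seconds. This matches the configuration above: the status check runs every 10 seconds and the service's recovery policy is relocate.
(Details omitted.)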
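One way to observe the failover from a third machine is to poll the floating IP once per second and log whether it answers. The sketch below assumes curl is installed on that machine. `probe_http` and `watch_probe` are hypothetical helper names, and the URL in the example is simply the floating IP and connector port configured earlier.

```shell
# probe_http URL
# Succeeds if URL returns any HTTP response within 2 seconds.
probe_http() {
    curl -s -o /dev/null --max-time 2 "$1"
}

# watch_probe COUNT CMD [ARGS...]
# Runs CMD COUNT times, about once per second, and prints one line per
# attempt: the current time followed by "up" or "down".
watch_probe() {
    wp_remaining=$1
    shift
    while [ "$wp_remaining" -gt 0 ]; do
        if "$@"; then wp_state=up; else wp_state=down; fi
        echo "$(date '+%H:%M:%S') $wp_state"
        wp_remaining=$((wp_remaining - 1))
        [ "$wp_remaining" -gt 0 ] && sleep 1
    done
    return 0
}

# Example: watch the service for one minute while killing Tomcat on node 1.
# watch_probe 60 probe_http http://192.168.16.31:8080/
```

The stretch of "down" lines between two runs of "up" lines gives a rough measure of how long the relocation took.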
-fin-
1 comment:
I'm french so sorry for my english.
Good job! You help me for my cluster under CentOS 5 for 2 nodes with IP, Tomcat and Postgres.
Thanks :)