Saturday, June 16, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 3

- Cluster software installation and configuration

Now it is time to install and configure the cluster software. On CentOS, the packages can be fetched from the default yum repository; on RHEL 6, you will probably need to add a CentOS repository first.


The following packages must be installed on both nodes.

[root@dbmaster-01 yum.repos.d]# yum -y install pacemaker corosync
Loaded plugins: product-id, rhnplugin, subscription-manager
Updating certificate-based repositories.
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package corosync.x86_64 0:1.4.1-4.el6 will be installed
**********************
***** detail skipped ****
**********************
Installed:
corosync.x86_64 0:1.4.1-4.el6 pacemaker.x86_64 0:1.1.6-3.el6

Dependency Installed:
cifs-utils.x86_64 0:4.8.1-5.el6
cluster-glue.x86_64 0:1.0.5-2.el6
cluster-glue-libs.x86_64 0:1.0.5-2.el6
clusterlib.x86_64 0:3.0.12.1-23.el6
corosynclib.x86_64 0:1.4.1-4.el6
keyutils.x86_64 0:1.4-3.el6
libevent.x86_64 0:1.4.13-1.el6
libgssglue.x86_64 0:0.1-11.el6
libibverbs.x86_64 0:1.1.5-3.el6
librdmacm.x86_64 0:1.0.14.1-3.el6
libtalloc.x86_64 0:2.0.1-1.1.el6
libtirpc.x86_64 0:0.2.1-5.el6
nfs-utils.x86_64 1:1.2.3-15.el6
nfs-utils-lib.x86_64 0:1.1.5-4.el6
pacemaker-cli.x86_64 0:1.1.6-3.el6
pacemaker-cluster-libs.x86_64 0:1.1.6-3.el6
pacemaker-libs.x86_64 0:1.1.6-3.el6
resource-agents.x86_64 0:3.9.2-7.el6
rpcbind.x86_64 0:0.2.0-8.el6

Complete!
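
Since the packages have to be present on both nodes, a quick check after the install can save some debugging later. The helper below is only a sketch: missing_packages is a name made up for this example, and it assumes its input is one package name per line.

```shell
# Hypothetical helper (not part of any cluster package).
# Reads installed package names from stdin, one per line, and prints
# every required package (given as arguments) that is not in that list.
missing_packages() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string
        printf '%s\n' "$installed" | grep -qxF -- "$pkg" || printf '%s\n' "$pkg"
    done
}
```

On a real node the list could come from rpm, for example: rpm -qa --qf '%{NAME}\n' | missing_packages pacemaker corosync. An empty result means both packages are installed.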

- Configure Corosync and Pacemaker
Create the configuration file /etc/corosync/corosync.conf. We only need to do this on dbmaster-01, as the file will be copied over to dbmaster-02 later.

[root@dbmaster-01 ~]# export ais_port=5405
[root@dbmaster-01 ~]# export ais_mcast=226.94.1.1
[root@dbmaster-01 ~]# export ais_addr=`ip addr | grep "inet " | grep eth0 | awk '{print $4}' | sed s/255/0/`
[root@dbmaster-01 ~]# env | grep ais_
ais_mcast=226.94.1.1
ais_port=5405
ais_addr=192.168.0.0
[root@dbmaster-01 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@dbmaster-01 ~]# sed -i.bak "s/.*mcastaddr:.*/mcastaddr:\ $ais_mcast/g" /etc/corosync/corosync.conf
[root@dbmaster-01 ~]# sed -i.bak "s/.*mcastport:.*/mcastport:\ $ais_port/g" /etc/corosync/corosync.conf
[root@dbmaster-01 ~]# sed -i.bak "s/.*bindnetaddr:.*/bindnetaddr:\ $ais_addr/g" /etc/corosync/corosync.conf
[root@dbmaster-01 ~]# cat <<-END >>/etc/corosync/service.d/pcmk
> service {
> # Load the Pacemaker Cluster Resource Manager
> name: pacemaker
> ver: 1
> }
> END
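
The sed s/255/0/ step above turns the broadcast address into the network address, which only works when the netmask is /24 and no earlier octet contains 255. As a rough alternative, the network address can be computed from the address and prefix length. This is a sketch only: network_addr is a hypothetical helper, and it assumes its argument is in a.b.c.d/prefix form.

```shell
# Hypothetical helper: print the IPv4 network address for a CIDR string.
# Assumes the argument looks like 192.168.0.11/24 (no input validation).
network_addr() {
    ip=${1%/*}          # part before the slash
    prefix=${1#*/}      # part after the slash
    old_ifs=$IFS
    IFS=.
    set -- $ip          # split the address into four octets
    IFS=$old_ifs
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 )) $(( net & 255 ))
}
```

Assuming eth0 is the cluster interface, as in this post, it could be used like this: export ais_addr=$(network_addr "$(ip -o -4 addr show dev eth0 | awk '{print $4}')").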

- Review the configuration file /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# cd /etc/corosync
[root@dbmaster-01 corosync]# cat corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
version: 2
secauth: off
threads: 0
interface {
ringnumber: 0
bindnetaddr: 192.168.0.0
mcastaddr: 226.94.1.1
mcastport: 5405
ttl: 1
}
}

logging {
fileline: off
to_stderr: no
to_logfile: yes
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}

amf {
mode: disabled
}
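
Instead of reading through the whole file, the three values changed by sed can be pulled out directly. The function below is a small sketch; conf_value is a name invented here, and it assumes the simple "key: value" layout shown above.

```shell
# Hypothetical helper: print the value of the first "key: value" line
# for the given key in a corosync.conf-style file.
#   $1 = key (for example bindnetaddr), $2 = path to the file
conf_value() {
    awk -v key="$1" '$1 == (key ":") { print $2; exit }' "$2"
}
```

For example, conf_value bindnetaddr /etc/corosync/corosync.conf should print the network address of the cluster interface, and conf_value mcastport /etc/corosync/corosync.conf should print the port that was exported as ais_port.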

- Replicate the configuration to the neighbor node (dbmaster-02) and start the corosync service on both nodes.

[root@dbmaster-01 corosync]# for f in /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk /etc/hosts; do scp $f dbmaster-02:$f ; done
[root@dbmaster-01 corosync]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@dbmaster-01 corosync]# grep -e "corosync.*network interface" -e "Corosync Cluster Engine" -e "Successfully read main configuration file" /var/log/messages
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] The network interface [192.168.0.11] is now up.
[root@dbmaster-01 corosync]# grep TOTEM /var/log/messages
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] The network interface [192.168.0.11] is now up.
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
[root@dbmaster-01 ~]# ssh dbmaster-02 -- /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
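
The grep commands above show the matching lines but leave it to the reader to notice a missing one. A stricter variant, sketched below, fails as soon as one of the three expected messages is absent. corosync_startup_ok is a made-up name, and the patterns are simply the ones used in the grep above.

```shell
# Hypothetical helper: succeed only if every expected Corosync startup
# message appears in the given log file ($1).
corosync_startup_ok() {
    log=$1
    for pattern in \
        'Corosync Cluster Engine' \
        'Successfully read main configuration file' \
        'network interface .* is now up'
    do
        grep -q -- "$pattern" "$log" || {
            echo "missing: $pattern" >&2
            return 1
        }
    done
}
```

It could be run on each node as corosync_startup_ok /var/log/messages && echo "corosync looks fine".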

- Monitor the startup status of corosync
Make sure the Pacemaker plugin was loaded successfully.

[root@dbmaster-01 corosync]# grep pcmk_startup /var/log/messages
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: CRM: Initialized
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] Logging: Initialized pcmk_startup
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Service: 10
Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Local hostname: dbmaster-01.localdomain
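
The same check can be scripted. The sketch below confirms that the "CRM: Initialized" line is present and prints the hostname the plugin reported, which should be the name the node later shows up under in the cluster status. pcmk_loaded_hostname is a hypothetical name, and the log format is assumed to match the lines above.

```shell
# Hypothetical helper: fail if the Pacemaker plugin never logged
# "CRM: Initialized" in the log file ($1); otherwise print the most
# recent local hostname it reported.
pcmk_loaded_hostname() {
    grep -q 'pcmk_startup: CRM: Initialized' "$1" || return 1
    sed -n 's/.*pcmk_startup: Local hostname: //p' "$1" | tail -n 1
}
```

For example: pcmk_loaded_hostname /var/log/messages, then compare the result with the output of uname -n on that node.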

- Start Pacemaker on both nodes
[root@dbmaster-01 ~]# chown -R hacluster:haclient /var/log/cluster
[root@dbmaster-01 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager: [ OK ]
[root@dbmaster-01 ~]# grep -e pacemakerd.*get_config_opt -e pacemakerd.*start_child -e "Starting Pacemaker" /var/log/messages
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'pacemaker' for option: name
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '1' for option: ver
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'pacemaker' for option: name
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '1' for option: ver
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'no' for option: use_logd
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'off' for option: debug
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'yes' for option: to_logfile
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'yes' for option: to_syslog
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: main: Starting Pacemaker 1.1.6-3.el6 (Build: a02c0f19a00c1eb2527ad38f146ebc0834814558): generated-manpages agent-manpages ascii-docs publican-docs ncurses trace-logging cman corosync-quorum corosync
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31341 for process stonith-ng
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31342 for process cib
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31343 for process lrmd
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31344 for process attrd
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31345 for process pengine
Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31346 for process crmd

[root@dbmaster-01 ~]# ssh dbmaster-02 -- chown -R hacluster:haclient /var/log/cluster
[root@dbmaster-01 ~]# ssh dbmaster-02 -- /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager: [ OK ]
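
In the log above, pacemakerd forks six children: stonith-ng, cib, lrmd, attrd, pengine and crmd. A short sketch like the one below can report any child that never appeared in the log. missing_children is an invented name, and the child list is taken from the output above, so it may differ on other Pacemaker versions.

```shell
# Hypothetical helper: print each expected pacemakerd child for which
# no "Forked child" line exists in the given log file ($1).
missing_children() {
    log=$1
    for child in stonith-ng cib lrmd attrd pengine crmd; do
        grep -q "start_child: Forked child [0-9]* for process $child" "$log" \
            || printf '%s\n' "$child"
    done
}
```

Running missing_children /var/log/messages on each node should print nothing when all six children were started.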

- Verify that the heartbeat processes have started
[root@dbmaster-01 ~]# ps axf
PID TTY STAT TIME COMMAND
2 ? S 0:00 [kthreadd]
... lots of processes ....
27718 ? Ssl 0:00 corosync
31337 pts/0 S 0:00 pacemakerd
31341 ? Ss 0:00 \_ /usr/lib64/heartbeat/stonithd
31342 ? Ss 0:00 \_ /usr/lib64/heartbeat/cib
31343 ? Ss 0:00 \_ /usr/lib64/heartbeat/lrmd
31344 ? Ss 0:00 \_ /usr/lib64/heartbeat/attrd
31345 ? Ss 0:00 \_ /usr/lib64/heartbeat/pengine
31346 ? Ss 0:00 \_ /usr/lib64/heartbeat/crmd
[root@dbmaster-01 corosync]# grep ERROR: /var/log/messages | grep -v unpack_resources
[root@dbmaster-01 corosync]#

- Verify that the HA cluster is now running

[root@dbmaster-01 ~]# crm_mon
============
Last updated: Thu Dec 29 05:19:52 2011
Last change: Thu Dec 29 05:07:59 2011 via crmd on dbmaster-01.localdomain
Stack: openais
Current DC: dbmaster-01.localdomain - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ dbmaster-01.localdomain dbmaster-02.localdomain ]
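
For scripted checks, the "Online:" line can be split into one node name per line. The following is a sketch that assumes the output format shown above; online_nodes is a hypothetical helper name.

```shell
# Hypothetical helper: read crm_mon output on stdin and print each node
# listed on the "Online: [ ... ]" line, one per line.
online_nodes() {
    awk '/^Online:/ {
        for (i = 1; i <= NF; i++)
            if ($i != "Online:" && $i != "[" && $i != "]") print $i
    }'
}
```

With the one-shot mode of crm_mon this becomes: crm_mon -1 | online_nodes. For this two-node cluster it should list both dbmaster-01.localdomain and dbmaster-02.localdomain.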
