Philip's Tech Blog: June 2012

Sunday, June 24, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 6

Resource Management
The examples below show how to manage the HA resources across the nodes.

- Check Cluster status

[root@dbmaster-01 ~]# crm_mon -1

============

Last updated: Fri Dec 30 00:43:51 2011

Last change: Fri Dec 30 00:20:38 2011 via crm_attribute on dbmaster-01.localdomain

Stack: openais

Current DC: dbmaster-01.localdomain - partition with quorum

Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558

2 Nodes configured, 2 expected votes

5 Resources configured.

============

Online: [ dbmaster-01.localdomain dbmaster-02.localdomain ]

Resource Group: dbGroup

ClusterIP (ocf::heartbeat:IPaddr2): Started dbmaster-01.localdomain

DBstore (ocf::heartbeat:Filesystem): Started dbmaster-01.localdomain

MySQL (ocf::heartbeat:mysql): Started dbmaster-01.localdomain

Clone Set: pingclone [check-ext-conn]

Started: [ dbmaster-01.localdomain dbmaster-02.localdomain ]
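When scripting around the status output, it can help to pull out just the node that currently runs a given resource. The following is a minimal sketch, assuming the one-line-per-resource layout shown above; the function name resource_node is made up for illustration and is not part of the crm tools.

```shell
# resource_node NAME: read `crm_mon -1` output on stdin and print the node
# on which the resource NAME is reported as Started (prints nothing otherwise).
# Hypothetical helper; it relies on the field layout shown in this howto.
resource_node() {
    awk -v name="$1" '$1 == name && $3 == "Started" { print $4 }'
}
```

On a cluster node this could be used as: crm_mon -1 | resource_node MySQL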

- Put a node into standby (offline) mode

[root@dbmaster-01 ~]# crm node standby

[root@dbmaster-01 ~]# crm_mon -1

============

Last updated: Fri Dec 30 00:44:45 2011

Last change: Fri Dec 30 00:44:39 2011 via crm_attribute on dbmaster-01.localdomain

Stack: openais

Current DC: dbmaster-01.localdomain - partition with quorum

Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558

2 Nodes configured, 2 expected votes

5 Resources configured.

============

Node dbmaster-01.localdomain: standby

Online: [ dbmaster-02.localdomain ]

Resource Group: dbGroup

ClusterIP (ocf::heartbeat:IPaddr2): Started dbmaster-02.localdomain

DBstore (ocf::heartbeat:Filesystem): Started dbmaster-02.localdomain

MySQL (ocf::heartbeat:mysql): Started dbmaster-02.localdomain

Clone Set: pingclone [check-ext-conn]

Started: [ dbmaster-02.localdomain ]

Stopped: [ check-ext-conn:0 ]

- Bring the node back online

[root@dbmaster-01 ~]# crm node online

[root@dbmaster-01 ~]# crm_mon -1

============

Last updated: Fri Dec 30 00:45:12 2011

Last change: Fri Dec 30 00:45:10 2011 via crm_attribute on dbmaster-01.localdomain

Stack: openais

Current DC: dbmaster-01.localdomain - partition with quorum

Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558

2 Nodes configured, 2 expected votes

5 Resources configured.

============

Online: [ dbmaster-01.localdomain dbmaster-02.localdomain ]

Resource Group: dbGroup

ClusterIP (ocf::heartbeat:IPaddr2): Started dbmaster-02.localdomain

DBstore (ocf::heartbeat:Filesystem): Started dbmaster-02.localdomain

MySQL (ocf::heartbeat:mysql): Started dbmaster-02.localdomain

Clone Set: pingclone [check-ext-conn]

Started: [ dbmaster-01.localdomain dbmaster-02.localdomain ]

- Migrate resources to neighbor node

[root@dbmaster-01 ~]# crm resource migrate dbGroup dbmaster-01.localdomain

[root@dbmaster-01 ~]# crm_mon -1

============

Last updated: Fri Dec 30 00:47:50 2011

Last change: Fri Dec 30 00:47:35 2011 via crm_resource on dbmaster-01.localdomain

Stack: openais

Current DC: dbmaster-01.localdomain - partition with quorum

Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558

2 Nodes configured, 2 expected votes

5 Resources configured.

============

Online: [ dbmaster-01.localdomain dbmaster-02.localdomain ]

Resource Group: dbGroup

ClusterIP (ocf::heartbeat:IPaddr2): Started dbmaster-01.localdomain

DBstore (ocf::heartbeat:Filesystem): Started dbmaster-01.localdomain

MySQL (ocf::heartbeat:mysql): Started dbmaster-01.localdomain

Clone Set: pingclone [check-ext-conn]

Started: [ dbmaster-01.localdomain dbmaster-02.localdomain ]
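One thing to keep in mind: as far as I know, a manual migrate works by adding a location constraint to the configuration, and that constraint stays in place until it is removed (crm resource unmigrate dbGroup). The sketch below lists such leftover constraints from the output of crm configure show. It assumes their ids start with "cli-" (for example cli-prefer-dbGroup), which may differ between versions; the function name cli_constraints is hypothetical.

```shell
# cli_constraints: read `crm configure show` output on stdin and print the
# ids of location constraints whose id starts with "cli-".
# Assumption: constraints created by a manual migrate use that id prefix.
cli_constraints() {
    awk '$1 == "location" && $2 ~ /^cli-/ { print $2 }'
}
```

It could be run as: crm configure show | cli_constraints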

- Start / Stop / Restart a specific resource on a node

[root@dbmaster-01 ~]# crm resource status MySQL

resource MySQL is running on: dbmaster-01.localdomain

...

[root@dbmaster-01 ~]# crm resource stop MySQL

....

[root@dbmaster-01 ~]# crm resource start MySQL

Thursday, June 21, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 5

Cluster management

The Corosync service is responsible for cluster management, while Pacemaker is responsible for the resources that run on top of the clustering service. The startup sequence is 1) corosync and then 2) pacemaker. The shutdown sequence is 1) pacemaker and then 2) corosync.
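The ordering can be captured in a small helper so that it is not mixed up during maintenance. This is only a sketch: the function name cluster_plan is made up, and it just prints the init-script commands in the right order rather than running them.

```shell
# cluster_plan ACTION: print, in order, the init-script commands for
# ACTION=start or ACTION=stop. It only prints; nothing is executed.
cluster_plan() {
    case "$1" in
        start) services="corosync pacemaker" ;;   # corosync first
        stop)  services="pacemaker corosync" ;;   # pacemaker first
        *)     echo "usage: cluster_plan start|stop" >&2; return 1 ;;
    esac
    for svc in $services; do
        echo "/etc/init.d/$svc $1"
    done
}
```

To actually run the commands, the output can be piped to a shell, e.g. cluster_plan stop | sh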

- Check service status

[root@dbmaster-02 ~]# /etc/init.d/corosync status

corosync (pid 23118) is running...

[root@dbmaster-02 ~]# /etc/init.d/pacemaker status

pacemakerd (pid 8714) is running...

- Stop pacemaker and corosync

If the node being stopped is the active node, its resources will fail over to the standby node. If it is the standby node, nothing changes on the active node.

[root@dbmaster-02 ~]# /etc/init.d/pacemaker stop

Signaling Pacemaker Cluster Manager to terminate: [ OK ]

Waiting for cluster services to unload:....... [ OK ]

[root@dbmaster-02 ~]# /etc/init.d/corosync stop

Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]

Waiting for corosync services to unload:. [ OK ]

- Start corosync and pacemaker

If no node is running in the cluster, the first node to come up becomes the active node. If there is already an active node in the cluster, the 2nd node automatically becomes the standby.

[root@dbmaster-02 ~]# /etc/init.d/corosync start

Starting Corosync Cluster Engine (corosync): [ OK ]

[root@dbmaster-02 ~]# /etc/init.d/pacemaker start

Starting Pacemaker Cluster Manager: [ OK ]

Monday, June 18, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 4

- Configure Cluster Resources

Now that the cluster is up, we have to add the cluster resources (e.g. VIP, MySQL DB store, MySQL DB service) on top of it. This only needs to be run once, on dbmaster-01, because the configuration changes are written to the cluster configuration and replicated to dbmaster-02.

- Configure misc cluster parameter

[root@dbmaster-01 ~]# crm configure property stonith-enabled=false

[root@dbmaster-01 ~]# crm configure property no-quorum-policy=ignore

[root@dbmaster-01 ~]# crm configure property start-failure-is-fatal="false"

[root@dbmaster-01 ~]# crm configure rsc_defaults resource-stickiness=100

- Configure VIP

[root@dbmaster-01 ~]# crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=192.168.0.10 cidr_netmask=32 op monitor interval=10s meta migration-threshold="10"

- Configure MySQL DB store, i.e. the shared-disk

[root@dbmaster-01 ~]# crm configure primitive DBstore ocf:heartbeat:Filesystem params device="/dev/sdb" directory="/mysql" fstype="ext4" meta migration-threshold="10"

WARNING: DBstore: default timeout 20s for start is smaller than the advised 60

WARNING: DBstore: default timeout 20s for stop is smaller than the advised 60

- Configure MySQL services

[root@dbmaster-01 ~]# crm configure primitive MySQL ocf:heartbeat:mysql params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" user="mysql" group="mysql" datadir="/mysql" log="/mysql/mysqld.log" \

> op start interval="0" timeout="60s" \

> op stop interval="0" timeout="60s" \

> op monitor interval="1min" timeout="60s" \

> meta migration-threshold="10" target-role="Started"

WARNING: MySQL: specified timeout 60s for start is smaller than the advised 120

WARNING: MySQL: specified timeout 60s for stop is smaller than the advised 120

- Configure all resources as a resource group for failover
If we don't configure them as a resource group, individual resources fail over separately, so you may end up with the VIP on dbmaster-01 and the DB store on dbmaster-02, which is not what we want.

[root@dbmaster-01 ~]# crm configure group dbGroup ClusterIP DBstore MySQL

- Define external ping monitoring and failover policy

This part of the configuration is a bit complicated. Basically, each node pings the gateway. If the active node fails to ping the gateway (e.g. its external connectivity is down), all services fail over to the standby node.

[root@dbmaster-01 ~]# crm configure primitive check-ext-conn ocf:pacemaker:ping \

> params host_list="192.168.0.1" multiplier="100" attempts="3" \

> op monitor interval="10s" timeout="5s" start stop \

> meta migration-threshold="10"

WARNING: check-ext-conn: default timeout 20s for start is smaller than the advised 60

WARNING: check-ext-conn: specified timeout 5s for monitor is smaller than the advised 60

[root@dbmaster-01 ~]# crm configure clone pingclone check-ext-conn meta globally-unique="false"

[root@dbmaster-01 ~]# crm

crm(live)# configure

crm(live)configure# location dbnode dbGroup \

> rule $id="dbnode-rule" pingd: defined pingd \

> rule $id="dbnode-rule-0" -inf: not_defined pingd or pingd lte 10 \

> rule $id="dbnode-rule-1" 20: uname eq dbmaster-01.localdomain \

> rule $id="dbnode-rule-2" 20: uname eq dbmaster-01

crm(live)configure# end

There are changes pending. Do you want to commit them? Yes

crm(live)configure# exit

bye
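To see how the first two rules of the dbnode constraint work out, here is a small sketch of the arithmetic. It assumes, as I understand the ping agent, that it publishes a node attribute (pingd by default) equal to the number of reachable hosts in host_list multiplied by multiplier. The function names are made up for illustration, and the extra 20 points for dbmaster-01 from the last two rules are left out.

```shell
# pingd_value REACHABLE MULTIPLIER: attribute value published by the ping
# agent (assumed to be reachable hosts times multiplier).
pingd_value() {
    echo $(( $1 * $2 ))
}

# dbgroup_ping_score PINGD: score that the "dbnode-rule" and "dbnode-rule-0"
# rules give dbGroup on a node. An empty argument stands for "pingd not
# defined". Hypothetical helper, not part of Pacemaker.
dbgroup_ping_score() {
    if [ -z "$1" ] || [ "$1" -le 10 ]; then
        echo "-INFINITY"    # node must not run dbGroup
    else
        echo "$1"           # score equals the pingd value
    fi
}
```

With one gateway and multiplier="100", a node that can reach the gateway scores 100, while a node that cannot (pingd is 0) scores -INFINITY and dbGroup has to move away from it.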

- Review all configuration details.

[root@dbmaster-01 ~]# crm configure show

node dbmaster-01.localdomain

node dbmaster-02.localdomain

primitive ClusterIP ocf:heartbeat:IPaddr2 \

params ip="192.168.0.10" cidr_netmask="32" \

op monitor interval="10s" \

meta migration-threshold="10"

primitive DBstore ocf:heartbeat:Filesystem \

params device="/dev/sdb" directory="/mysql" fstype="ext4" \

meta migration-threshold="10"

primitive MySQL ocf:heartbeat:mysql \

params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" user="mysql" group="mysql" datadir="/mysql" log="/mysql/mysqld.log" \

op start interval="0" timeout="60s" \

op stop interval="0" timeout="60s" \

op monitor interval="1min" timeout="60s" \

meta migration-threshold="10" target-role="Started"

primitive check-ext-conn ocf:pacemaker:ping \

params host_list="192.168.0.1" multiplier="100" attempts="3" \

op monitor interval="10s" timeout="5s" start stop \

meta migration-threshold="10"

group dbGroup ClusterIP DBstore MySQL

clone pingclone check-ext-conn \

meta globally-unique="false"

location dbnode dbGroup \

rule $id="dbnode-rule" pingd: defined pingd \

rule $id="dbnode-rule-0" -inf: not_defined pingd or pingd lte 10 \

rule $id="dbnode-rule-1" 20: uname eq dbmaster-01.localdomain \

rule $id="dbnode-rule-2" 20: uname eq dbmaster-01

property $id="cib-bootstrap-options" \

dc-version="1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558" \

cluster-infrastructure="openais" \

expected-quorum-votes="2" \

stonith-enabled="false" \

no-quorum-policy="ignore" \

start-failure-is-fatal="false"

rsc_defaults $id="rsc-options" \

resource-stickiness="100"

Saturday, June 16, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 3

- Cluster software installation and configuration

Now it is time to install and configure the cluster software. On CentOS the packages can be fetched from the default yum repository; on RHEL 6 you will probably need to add a CentOS repository.

The packages below have to be installed on both nodes.

[root@dbmaster-01 yum.repos.d]# yum -y install pacemaker corosync

Loaded plugins: product-id, rhnplugin, subscription-manager

Updating certificate-based repositories.

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package corosync.x86_64 0:1.4.1-4.el6 will be installed

**********************

***** detail skipped ****

**********************

Installed:

corosync.x86_64 0:1.4.1-4.el6 pacemaker.x86_64 0:1.1.6-3.el6

Dependency Installed:

cifs-utils.x86_64 0:4.8.1-5.el6

cluster-glue.x86_64 0:1.0.5-2.el6

cluster-glue-libs.x86_64 0:1.0.5-2.el6

clusterlib.x86_64 0:3.0.12.1-23.el6

corosynclib.x86_64 0:1.4.1-4.el6

keyutils.x86_64 0:1.4-3.el6

libevent.x86_64 0:1.4.13-1.el6

libgssglue.x86_64 0:0.1-11.el6

libibverbs.x86_64 0:1.1.5-3.el6

librdmacm.x86_64 0:1.0.14.1-3.el6

libtalloc.x86_64 0:2.0.1-1.1.el6

libtirpc.x86_64 0:0.2.1-5.el6

nfs-utils.x86_64 1:1.2.3-15.el6

nfs-utils-lib.x86_64 0:1.1.5-4.el6

pacemaker-cli.x86_64 0:1.1.6-3.el6

pacemaker-cluster-libs.x86_64 0:1.1.6-3.el6

pacemaker-libs.x86_64 0:1.1.6-3.el6

resource-agents.x86_64 0:3.9.2-7.el6

rpcbind.x86_64 0:0.2.0-8.el6

Complete!

- Configure Corosync and Pacemaker
Create the configuration file /etc/corosync/corosync.conf. We only need to do this on dbmaster-01, as we will copy the file over to dbmaster-02.

[root@dbmaster-01 ~]# export ais_port=5405

[root@dbmaster-01 ~]# export ais_mcast=226.94.1.1

[root@dbmaster-01 ~]# export ais_addr=`ip addr | grep "inet " | grep eth0 | awk '{print $4}' | sed s/255/0/`

[root@dbmaster-01 ~]# env | grep ais_

ais_mcast=226.94.1.1

ais_port=5405

ais_addr=192.168.0.0

[root@dbmaster-01 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# sed -i.bak "s/.*mcastaddr:.*/mcastaddr:\ $ais_mcast/g" /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# sed -i.bak "s/.*mcastport:.*/mcastport:\ $ais_port/g" /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# sed -i.bak "s/.*bindnetaddr:.*/bindnetaddr:\ $ais_addr/g" /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# cat <<-END >>/etc/corosync/service.d/pcmk

> service {

> # Load the Pacemaker Cluster Resource Manager

> name: pacemaker

> ver: 1

> }

> END
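The ais_addr line above derives the network address by taking the broadcast address of eth0 and replacing 255 with 0, which works for a /24 network like the one in this howto. As a more general sketch, the network address for bindnetaddr can be computed from an IP address and a prefix length. The function name network_addr is made up for illustration.

```shell
# network_addr IP PREFIX: print the IPv4 network address for IP/PREFIX,
# e.g. the value one would put into bindnetaddr.
network_addr() {
    local ip=$1 prefix=$2 a b c d ip_int mask net
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<EOF
$ip
EOF
    ip_int=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Build the netmask from the prefix length and apply it.
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( ip_int & mask ))
    printf '%d.%d.%d.%d\n' \
        $(( (net >> 24) & 255 )) $(( (net >> 16) & 255 )) \
        $(( (net >> 8) & 255 ))  $(( net & 255 ))
}
```

For the addresses used in this howto, network_addr 192.168.0.11 24 gives 192.168.0.0.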

- Review the configuration file /etc/corosync/corosync.conf

[root@dbmaster-01 ~]# cd /etc/corosync

[root@dbmaster-01 corosync]# cat corosync.conf

# Please read the corosync.conf.5 manual page

compatibility: whitetank

totem {

version: 2

secauth: off

threads: 0

interface {

ringnumber: 0

bindnetaddr: 192.168.0.0

mcastaddr: 226.94.1.1

mcastport: 5405

ttl: 1

}

}

logging {

fileline: off

to_stderr: no

to_logfile: yes

to_syslog: yes

logfile: /var/log/cluster/corosync.log

debug: off

timestamp: on

logger_subsys {

subsys: AMF

debug: off

}

}

amf {

mode: disabled

}

- Replicate the configuration to neighbor node (dbmaster-02) and start corosync service.

[root@dbmaster-01 corosync]# for f in /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk /etc/hosts; do scp $f dbmaster-02:$f ; done

[root@dbmaster-01 corosync]# /etc/init.d/corosync start

Starting Corosync Cluster Engine (corosync): [ OK ]

[root@dbmaster-01 corosync]# grep -e "corosync.*network interface" -e "Corosync Cluster Engine" -e "Successfully read main configuration file" /var/log/messages

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] The network interface [192.168.0.11] is now up.

[root@dbmaster-01 corosync]# grep TOTEM /var/log/messages

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] Initializing transport (UDP/IP Multicast).

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] The network interface [192.168.0.11] is now up.

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [TOTEM ] A processor joined or left the membership and a new membership was formed.

[root@dbmaster-01 ~]# ssh dbmaster-02 -- /etc/init.d/corosync start

Starting Corosync Cluster Engine (corosync): [ OK ]

- Monitoring startup status of corosync
Make sure pacemaker module is loaded successfully.

[root@dbmaster-01 corosync]# grep pcmk_startup /var/log/messages

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: CRM: Initialized

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] Logging: Initialized pcmk_startup

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Service: 10

Dec 29 03:08:39 dbmaster-01 corosync[27718]: [pcmk ] info: pcmk_startup: Local hostname: dbmaster-01.localdomain

- Startup pacemaker on both nodes

[root@dbmaster-01 ~]# chown -R hacluster:haclient /var/log/cluster

[root@dbmaster-01 ~]# /etc/init.d/pacemaker start

Starting Pacemaker Cluster Manager: [ OK ]

[root@dbmaster-01 ~]# grep -e pacemakerd.*get_config_opt -e pacemakerd.*start_child -e "Starting Pacemaker" /var/log/messages

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'pacemaker' for option: name

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '1' for option: ver

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'pacemaker' for option: name

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '1' for option: ver

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'no' for option: use_logd

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'no' for option: use_mgmtd

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'off' for option: debug

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'yes' for option: to_logfile

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Found 'yes' for option: to_syslog

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31333]: info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: main: Starting Pacemaker 1.1.6-3.el6 (Build: a02c0f19a00c1eb2527ad38f146ebc0834814558): generated-manpages agent-manpages ascii-docs publican-docs ncurses trace-logging cman corosync-quorum corosync

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31341 for process stonith-ng

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31342 for process cib

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31343 for process lrmd

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31344 for process attrd

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31345 for process pengine

Dec 29 03:29:05 dbmaster-01 pacemakerd: [31337]: info: start_child: Forked child 31346 for process crmd

[root@dbmaster-01 ~]# ssh dbmaster-02 -- chown -R hacluster:haclient /var/log/cluster

[root@dbmaster-01 ~]# ssh dbmaster-02 -- /etc/init.d/pacemaker start

Starting Pacemaker Cluster Manager: [ OK ]

- Verify that the cluster processes are started

[root@dbmaster-01 ~]# ps axf

PID TTY STAT TIME COMMAND

2 ? S 0:00 [kthreadd]

... lots of processes ....

27718 ? Ssl 0:00 corosync

31337 pts/0 S 0:00 pacemakerd

31341 ? Ss 0:00 \_ /usr/lib64/heartbeat/stonithd

31342 ? Ss 0:00 \_ /usr/lib64/heartbeat/cib

31343 ? Ss 0:00 \_ /usr/lib64/heartbeat/lrmd

31344 ? Ss 0:00 \_ /usr/lib64/heartbeat/attrd

31345 ? Ss 0:00 \_ /usr/lib64/heartbeat/pengine

31346 ? Ss 0:00 \_ /usr/lib64/heartbeat/crmd

[root@test-db1 corosync]# grep ERROR: /var/log/messages | grep -v unpack_resources

[root@test-db1 corosync]#

- Verify the HA cluster is running now

[root@dbmaster-01 ~]# crm_mon

============

Last updated: Thu Dec 29 05:19:52 2011

Last change: Thu Dec 29 05:07:59 2011 via crmd on dbmaster-01.localdomain

Stack: openais

Current DC: dbmaster-01.localdomain - partition with quorum

Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558

2 Nodes configured, 2 expected votes

0 Resources configured.

============

Online: [ dbmaster-01.localdomain dbmaster-02.localdomain ]

Thursday, June 14, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 2

- OS Pre-configuration task
The items listed below have to be done on both master nodes.

- Disable SELinux and iptables
[root@dbmaster-01~]# getenforce
Disabled
[root@dbmaster-01~]# service iptables stop
[root@dbmaster-01~]# chkconfig iptables off

- Network configuration

Each master node has to be configured with 2 network interfaces. One is the external NIC for external traffic (e.g. MySQL traffic, Internet traffic, etc.), while the other is the internal heartbeat NIC, which connects the 2 master nodes.

In our scenario the master nodes have the following IP addresses.

dbmaster-01: IP 192.168.0.11/24, Gateway 192.168.0.1
dbmaster-02: IP 192.168.0.12/24, Gateway 192.168.0.1

The VIP is 192.168.0.10/24, which floats between the 2 nodes depending on server availability.

- Generate SSH-key and allow key-based authentication

[root@dbmaster-01~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

c7:50:ac:a4:09:fb:5d:f3:1e:13:ed:2e:21:d4:a7:f7 root@dbmaster-01

The key's randomart image is:

+--[ RSA 2048]----+

| .. |

| . ... |

| o +.. . . |

| . o .o+ o o |

| . .Sooo = |

| . ... * o |

| o * . |

| o . E|

| . |

+-----------------+

[root@dbmaster-01~]# ssh-copy-id -i .ssh/id_rsa.pub root@dbmaster-02 ##### replace dbmaster-02 with dbmaster-01 if you are doing it from dbmaster-02 to dbmaster-01

The authenticity of host 'dbmaster-02 (192.168.0.12)' can't be established.

RSA key fingerprint is 51:73:cb:48:c3:8e:9c:39:88:38:b8:a9:70:b8:fd:76.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'dbmaster-02' (RSA) to the list of known hosts.

root@dbmaster-02's password:

Now try logging into the machine, with "ssh 'root@dbmaster-02'", and check in:

.ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[root@dbmaster-01 ~]# ssh dbmaster-02 date ##### replace dbmaster-02 with dbmaster-01 if you are doing it from dbmaster-02 to dbmaster-01

Tue Dec 27 22:08:44 EST 2011

- Configure ntpd to make sure the clocks are in sync

[root@dbmaster-01 ~]# service ntpd stop

Shutting down ntpd: [ OK ]

[root@dbmaster-01 ~]# ntpdate ntp.asia.pool.ntp.org

27 Dec 22:11:27 ntpdate[1796]: adjust time server x.x.x.x offset 0.000983 sec

[root@dbmaster-01 ~]# service ntpd start

Starting ntpd: [ OK ]

- Add host entries to the hosts file to make sure both nodes can resolve each other by name

[root@dbmaster-01 ~]# cat /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

::1 localhost6.localdomain6 localhost6

192.168.0.11 dbmaster-01 dbmaster-01.localdomain

192.168.0.12 dbmaster-02 dbmaster-02.localdomain

[root@dbmaster-01 ~]# ping dbmaster-01

PING dbmaster-01 (192.168.0.11) 56(84) bytes of data.

64 bytes from dbmaster-01 (192.168.0.11): icmp_seq=1 ttl=64 time=0.017 ms

--- dbmaster-01 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 588ms

rtt min/avg/max/mdev = 0.017/0.017/0.017/0.000 ms

[root@dbmaster-01 ~]# ping dbmaster-02

PING dbmaster-02 (192.168.0.12) 56(84) bytes of data.

64 bytes from dbmaster-02 (192.168.0.12): icmp_seq=1 ttl=64 time=0.017 ms

--- dbmaster-02 ping statistics ---

1 packets transmitted, 1 received, 0% packet loss, time 588ms

rtt min/avg/max/mdev = 0.017/0.017/0.017/0.000 ms
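Besides ping, the entries can also be checked directly against the file. The following is a minimal sketch that looks up a name in hosts-file text; the function name hosts_lookup is made up for illustration.

```shell
# hosts_lookup NAME: read hosts-file text on stdin and print the first
# address whose line lists NAME as a hostname or alias.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                      # drop comments
        {
            for (i = 2; i <= NF; i++)           # field 1 is the address
                if ($i == name) { print $1; exit }
        }'
}
```

On each node, hosts_lookup dbmaster-02 < /etc/hosts should print 192.168.0.12 with the entries shown above.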

- Format the shared disk

The shared disk will store the MySQL database and has to be formatted from one of the master nodes (either one). If you are using VMware ESXi, you can simply attach the disk to both VMs, and it will probably be presented as the second disk (e.g. /dev/sdb). Please note that concurrent access is not allowed: only one node may mount and write to the shared disk at any time, otherwise the filesystem will be corrupted. In our example, we assume /dev/sdb is the shared disk.

[root@dbmaster-01 ~]# mkfs.ext4 /dev/sdb #### /dev/sdb is the shared disk in this example
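Since the disk must never be in use on two nodes at once, it is worth checking the mount tables before formatting or mounting it by hand. Below is a small sketch that scans /proc/mounts-style text for a device. The function name is_mounted is made up, and the check only covers the node whose mount table is fed in, so it has to be repeated for each node.

```shell
# is_mounted DEVICE: read /proc/mounts-style text on stdin; succeed (exit 0)
# if DEVICE appears as a mounted source, fail (exit 1) otherwise.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}
```

For example, is_mounted /dev/sdb < /proc/mounts checks the local node, and ssh dbmaster-02 cat /proc/mounts | is_mounted /dev/sdb checks the other one.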

- Installing MySQL Packages.

This should be pretty straightforward, provided that your server can access the Internet without any issue.

[root@dbmaster-01 ~]# yum -y install mysql-server mysql

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package mysql.x86_64 0:5.1.52-1.el6_0.1 will be installed

****************

**** skipped ****

****************

Dependency Installed:

perl-DBD-MySQL.x86_64 0:4.013-3.el6 perl-DBI.x86_64 0:1.609-4.el6

Complete!

- Configure MySQL
Edit /etc/my.cnf as below.

[root@dbmaster-01 ~]# cat /etc/my.cnf

[mysqld]

datadir=/mysql

socket=/mysql/mysql.sock

user=mysql

# Disabling symbolic-links is recommended to prevent assorted security risks

symbolic-links=0

innodb_rollback_on_timeout=1

innodb_lock_wait_timeout=600

log-bin=mysql-bin ## Just in case you want replication

binlog-format='ROW' ## Just in case you want replication

[mysqld_safe]

log-error=/mysql/mysqld.log

pid-file=/var/run/mysqld/mysqld.pid
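The datadir and log-error settings here are later referenced by the cluster resources (directory="/mysql" and log="/mysql/mysqld.log"), so it can be handy to read them back from the file and compare. This is a rough sketch of such a lookup; the function name cnf_get is made up, and it only handles simple key=value lines like the ones above.

```shell
# cnf_get SECTION KEY: read my.cnf-style text on stdin and print the value
# of KEY inside [SECTION]. Trailing "# ..." comments are stripped.
cnf_get() {
    awk -v section="[$1]" -v key="$2" '
        /^\[/ { in_section = ($1 == section); next }   # section header line
        in_section {
            line = $0
            sub(/[ \t]*#.*/, "", line)                  # drop trailing comment
            n = index(line, "=")
            if (n && substr(line, 1, n - 1) == key) {
                print substr(line, n + 1)
                exit
            }
        }'
}
```

With the file shown above, cnf_get mysqld datadir < /etc/my.cnf should print /mysql.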

- Create the mount point and mount the shared disk

[root@dbmaster-01 ~]# mkdir /mysql

[root@dbmaster-01 ~]# mount /dev/sdb /mysql

[root@dbmaster-01 ~]# chown -R mysql:mysql /mysql

- Run the mysql_install_db script to install the base database. Make sure it is run only once.

[root@dbmaster-01 ~]# mysql_install_db

Installing MySQL system tables...

Filling help tables...

To start mysqld at boot time you have to copy

support-files/mysql.server to the right place for your system

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !

To do so, start the server, then issue the following commands:

/usr/bin/mysqladmin -u root password 'new-password'

/usr/bin/mysqladmin -u root -h test-db1.dpcloud.local password 'new-password'

Alternatively you can run:

/usr/bin/mysql_secure_installation

which will also give you the option of removing the test

databases and anonymous user created by default. This is

strongly recommended for production servers.

See the manual for more instructions.

You can start the MySQL daemon with:

cd /usr ; /usr/bin/mysqld_safe &

You can test the MySQL daemon with mysql-test-run.pl

cd /usr/mysql-test ; perl mysql-test-run.pl

Please report any problems with the /usr/bin/mysqlbug script!

[root@dbmaster-01 ~]# chown -R mysql:mysql /mysql

- Remember to unmount the shared disk /mysql after that.

[root@dbmaster-01 ~]# umount /mysql

Tuesday, June 12, 2012

HA Active-Standby MySQL + Heartbeat 3.x + Corosync 1.x + Pacemaker 1.x on RHEL / CentOS - Section 1

- The HA Cluster design.

This HA MySQL clustering configuration is based on 2 servers in an active-backup relationship. The diagram below explains the logical design of the setup.

Both HA master nodes are installed with the heartbeat packages and have an internal NIC with a private IP presented to each other. The heartbeat software on each node checks the status of the remote node and takes over the "active node" role when the remote node is down. The active node is responsible for taking over the VIP (Virtual IP, the IP floating between the 2 HA masters), mounting the MySQL DB store and spawning the MySQL process to serve MySQL requests. If a replication slave is needed, a 3rd node can be added to the cluster as a replication slave, hooking into the VIP to fetch binary logs. The detailed procedure for a replication slave is out of the scope of this howto, but it should be pretty straightforward for those familiar with MySQL master-slave replication setup.

- Hardware and software requirements of both nodes

So here are the ingredients for the HA setup.

2 servers with identical hardware configuration
Minimum requirement: at least 1 GB memory and a 20 GB root disk
Each node needs a pair of NICs: a public NIC and an internal heartbeat NIC
The public NIC is configured as public facing (or at least connected to the Internet so that packages can be fetched)
The internal NICs of both nodes are connected to each other via either a crossover cable or the same VLAN.
One shared LUN (logical unit) that is visible to and mountable by both nodes. You can provide this using iSCSI (e.g. Openfiler), a VMware shared disk, physical Fibre Channel storage or a DRBD disk. In this example we used a pre-configured VMware ESX shared disk.
One shared Virtual IP for IP failover between the 2 nodes. This IP is on the same segment as the public NIC's IP.
The OS is RHEL/CentOS 6 64-bit with a minimal installation. Additional software repositories have to be added so that we can fetch and install Heartbeat 3.x, Corosync 1.x and Pacemaker 1.1.x.

A good reference for improving your slides.

Note: It has been quite some time since I got this book through the O'Reilly Blogger Review Program, and I am only now getting around to writing the review.

Basically, I am a technical geek with no concept of how a presentation should be put together. This book showed me how to create an interesting presentation. It is well written, with lots of examples demonstrating the ideas. It is definitely a good reference for anyone looking to improve their presentation skills.
