
Tuesday, April 28, 2009

Setting Up MySQL Cluster

A quick walkthrough of installing and configuring MySQL Cluster, plus the steps for adding data nodes to a running cluster.


Data Nodes:
Data node A: dev24-db-1
Data node B: dev24-rpt-1
Data node C: qa-blur-db-1 (*used for the add-nodes test)
Data node D: dev-db-1 (*used for the add-nodes test)

SQL Node:
dev-blur-db-1

Management Node (MGMT):
qa-pxy-1



==================================================
Installing the Cluster
Reference: MySQL Cluster Multi-Computer How-To


1. Install the software

Install the NDB storage engine package on the data nodes:
rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-storage-7.0.5-0.rhel5.x86_64.rpm
[root@DEV24-DB-1 ~]# rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-storage-7.0.5-0.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-gpl-stora########################################### [100%]
[root@DEV24-DB-1 ~]#

Install the MySQL server package on the SQL node:
rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-server-7.0.5-0.rhel5.x86_64.rpm
[root@DEV-Blur-DB-1 ~]# rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-server-7.0.5-0.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-gpl-serve########################################### [100%]
090422 15:31:34 [Warning] Forcing shutdown of 3 plugins
090422 15:31:34 [Warning] Forcing shutdown of 3 plugins

PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:

/usr/bin/mysqladmin -u root password 'new-password'
/usr/bin/mysqladmin -u root -h DEV-Blur-DB-1.s3lab.mot.com password 'new-password'

Alternatively you can run:
/usr/bin/mysql_secure_installation

which will also give you the option of removing the test
databases and anonymous user created by default.  This is
strongly recommended for production servers.

See the manual for more instructions.

Please report any problems with the /usr/bin/mysqlbug script!

The latest information about MySQL is available at http://www.mysql.com/
Support MySQL by buying support/licenses from http://shop.mysql.com/

[root@DEV-Blur-DB-1 ~]#

Install the management package on the management node:
rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-management-7.0.5-0.rhel5.x86_64.rpm
[root@QA-pxy-1 ~]# rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-management-7.0.5-0.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-gpl-manag########################################### [100%]
[root@QA-pxy-1 ~]#


2. Configure connection parameters

Data node configuration:
cat >/etc/my.cnf<<'EOF'
[mysql_cluster]
ndb-connectstring=qa-pxy-1.s3lab.mot.com   # location of management server
EOF

SQL node configuration:
cat >/etc/my.cnf<<'EOF'
[mysqld]
ndbcluster                                 # run NDB storage engine
ndb-connectstring=qa-pxy-1.s3lab.mot.com   # location of management server
EOF

Management node configuration:
mkdir -p /var/lib/mysql-cluster
cat >/var/lib/mysql-cluster/config.ini<<'EOF'
# Options affecting ndbd processes on all data nodes:
[ndbd default]
NoOfReplicas=2   # Number of replicas
DataMemory=80M   # How much memory to allocate for data storage
IndexMemory=18M  # How much memory to allocate for index storage
                 # For DataMemory and IndexMemory, we have used the
                 # default values. Since the "world" database takes up
                 # only about 500KB, this should be more than enough for
                 # this example Cluster setup.
# TCP/IP options:
[tcp default]
portnumber=2202  # This is the default; however, you can use any port that is free
                 # for all the hosts in the cluster
                 # Note: It is recommended that you do not specify the port
                 # number at all and allow the default value to be used instead
# Management process options:
[ndb_mgmd]
hostname=qa-pxy-1.s3lab.mot.com # Hostname or IP address of management node
datadir=/var/lib/mysql-cluster  # Directory for management node log files
# Options for data node "A":
[ndbd]
                                   # (one [ndbd] section per data node)
hostname=dev24-db-1.s3lab.mot.com  # Hostname or IP address
datadir=/usr/local/mysql/data      # Directory for this data node's data files
# Options for data node "B":
[ndbd]
hostname=dev24-rpt-1.s3lab.mot.com # Hostname or IP address
datadir=/usr/local/mysql/data      # Directory for this data node's data files
# SQL node options:
[mysqld]
hostname=dev-blur-db-1.s3lab.mot.com # Hostname or IP address
                                     # (additional mysqld connections can be
                                     # specified for this node for various
                                     # purposes such as running ndb_restore)
EOF
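
Before starting anything, the parsed configuration can be sanity-checked by having ndb_mgmd print it without actually starting the daemon (an optional check, not in the original how-to; confirm the option is available with ndb_mgmd --help):
ndb_mgmd --print-full-config -f /var/lib/mysql-cluster/config.ini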


3. Start the services

Start the management node:
ndb_mgmd -f /var/lib/mysql-cluster/config.ini
[root@QA-pxy-1 ~]# ndb_mgmd -f /var/lib/mysql-cluster/config.ini
2009-04-22 15:52:42 [MgmSrvr] INFO     -- NDB Cluster Management Server. mysql-5.1.32 ndb-7.0.5-beta
2009-04-22 15:52:42 [MgmSrvr] INFO     -- The default config directory '//mysql-cluster' does not exist. Trying to create it...
2009-04-22 15:52:42 [MgmSrvr] INFO     -- Sucessfully created config directory
2009-04-22 15:52:42 [MgmSrvr] INFO     -- Reading cluster configuration from '/var/lib/mysql-cluster/config.ini'
After the first start, a binary-format configuration file has been created under /mysql-cluster/:
[root@QA-pxy-1 ~]# ll /mysql-cluster/
total 4
-rw-r--r-- 1 root root 3176 Apr 22 15:52 ndb_1_config.bin.1
[root@QA-pxy-1 ~]# file /mysql-cluster/ndb_1_config.bin.1
/mysql-cluster/ndb_1_config.bin.1: data
[root@QA-pxy-1 ~]#
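
The binary file itself is not human-readable; to see what the management server actually loaded, the ndb_config utility can query it (an optional check, not in the original how-to; verify the options with ndb_config --help):
ndb_config --type=ndbd --query=nodeid,host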

Start the data nodes:
mkdir -p /usr/local/mysql/data
ndbd
[root@DEV24-RPT-1 ~]# mkdir -p /usr/local/mysql/data
[root@DEV24-RPT-1 ~]# ndbd
2009-04-22 15:55:05 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 1
[root@DEV24-RPT-1 ~]#

Start the SQL node:

service mysql start
[root@DEV-Blur-DB-1 ~]# service mysql start
Starting MySQL..                                           [  OK  ]
[root@DEV-Blur-DB-1 ~]#
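
As a quick sanity check (not in the original how-to), the engine list on the SQL node should now include ndbcluster:
mysql -e "SHOW ENGINES" | grep -i ndbcluster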


4. Management

Show status:
On the management node, first install the tools package (it provides ndb_mgm), then show the cluster status:
[root@QA-pxy-1 ~]# rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-tools-7.0.5-0.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-gpl-tools########################################### [100%]
[root@QA-pxy-1 ~]#
[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=3    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=4    @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#

Stop:
On the management node, run
ndb_mgm -e shutdown
This command stops both the management node and the data node processes.
[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=3    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=4    @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]# ndb_mgm -e shutdown
Connected to Management Server at: localhost:1186
2 NDB Cluster node(s) have shutdown.
Disconnecting to allow management server to shutdown.
[root@QA-pxy-1 ~]#

On the SQL node, run
mysqladmin shutdown
or
service mysql stop
to stop the MySQL service.

Start:
On the management node, run
ndb_mgmd
[root@QA-pxy-1 ~]# ndb_mgmd
2009-04-28 03:16:23 [MgmSrvr] INFO     -- NDB Cluster Management Server. mysql-5.1.32 ndb-7.0.5-beta
2009-04-28 03:16:23 [MgmSrvr] INFO     -- Loaded config from '//mysql-cluster/ndb_1_config.bin.1'
[root@QA-pxy-1 ~]#
This time it reads the binary configuration file.

Data nodes:
[root@DEV24-DB-1 ~]# ndbd
2009-04-28 03:17:14 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 1
[root@DEV24-DB-1 ~]#
[root@DEV24-RPT-1 ~]# ndbd
2009-04-28 03:17:35 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 1
[root@DEV24-RPT-1 ~]#

SQL node:
[root@DEV-Blur-DB-1 /var/lib/mysql]# service mysql start
Starting MySQL.                                            [  OK  ]
[root@DEV-Blur-DB-1 /var/lib/mysql]# ps -ef|grep mysql
root      2860     1  0 03:20 pts/64   00:00:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --pid-file=/var/lib/mysql/DEV-Blur-DB-1.s3lab.mot.com.pid
mysql     2921  2860  1 03:20 pts/64   00:00:00 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --user=mysql --log-error=/var/lib/mysql/DEV-Blur-DB-1.s3lab.mot.com.err --pid-file=/var/lib/mysql/DEV-Blur-DB-1.s3lab.mot.com.pid
root      2946  1794  0 03:20 pts/64   00:00:00 grep mysql
[root@DEV-Blur-DB-1 /var/lib/mysql]#


5. Install the sample database

Install the sample (world) database on the SQL node; the client package provides the mysql command:
[root@DEV-Blur-DB-1 ~]# rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-client-7.0.5-0.rhel5.x86_64.rpm
Preparing...                ########################################### [100%]
   1:MySQL-Cluster-gpl-clien########################################### [100%]
[root@DEV-Blur-DB-1 ~]#
sed 's/ENGINE=MyISAM/ENGINE=NDBCLUSTER/g' /u01/software/mysql/cluster/world.sql >/u01/software/mysql/cluster/world1.sql
mysql -u root <<EOF
CREATE DATABASE world;
USE world;
source /u01/software/mysql/cluster/world1.sql
exit
EOF
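
The sed step rewrote every table definition to use the NDBCLUSTER engine. To confirm the conversion took effect after loading (an extra check, not in the original how-to):
mysql -e "SELECT TABLE_NAME, ENGINE FROM information_schema.TABLES WHERE TABLE_SCHEMA='world'"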





==================================================
Testing: Stopping a Data Node

1. Show status
[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=3    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=4    @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#


2. Create a test table
On the SQL node:
mysql -vvv -e "
use world;
create table a (a int) engine=ndbcluster;
insert into a values (1);
commit;
select * from a;
"
[root@DEV-Blur-DB-1 ~]# mysql -vvv -e "
> use world;
> create table a (a int) engine=ndbcluster;
> insert into a values (1);
> commit;
> select * from a;
> "
--------------
create table a (a int) engine=ndbcluster
--------------

Query OK, 0 rows affected (0.55 sec)

--------------
insert into a values (1)
--------------

Query OK, 1 row affected (0.00 sec)

--------------
commit
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
select * from a
--------------

+------+
| a    |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

Bye
[root@DEV-Blur-DB-1 ~]#


3. Stop data node A
On the data node host, kill the ndbd processes:
kill $(pgrep ndbd)
[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2 (not connected, accepting connect from dev24-db-1.s3lab.mot.com)
id=3    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=4    @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#
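
Killing the process simulates an unclean failure. A single data node can also be stopped cleanly from the management client, using the same node-id stop command applied to the management node later in this write-up:
ndb_mgm -e "2 stop"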

Management node log:
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 2: Node shutdown completed. Initiated by signal 15.
2009-04-28 03:27:31 [MgmSrvr] ALERT    -- Node 1: Node 2 Disconnected
2009-04-28 03:27:31 [MgmSrvr] ALERT    -- Node 3: Node 2 Disconnected
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 closed
2009-04-28 03:27:31 [MgmSrvr] ALERT    -- Node 3: Network partitioning - arbitration required
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: President restarts arbitration thread [state=7]
2009-04-28 03:27:31 [MgmSrvr] ALERT    -- Node 3: Arbitration won - positive reply from node 1
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: GCP Take over started
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: Node 3 taking over as DICT master
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: GCP Take over completed
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: kk: 231387/1 0 0
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: LCP Take over started
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: ParticipatingDIH = 0000000000000000
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: ParticipatingLQH = 0000000000000000
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: LCP Take over completed (state = 4)
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: ParticipatingDIH = 0000000000000000
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: ParticipatingLQH = 0000000000000000
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_Counter_DIH = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_Counter_LQH = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LAST_LCP_FRAG_ORD = [SignalCounter: m_count=0 0000000000000000]
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: m_LCP_COMPLETE_REP_From_Master_Received = 1
2009-04-28 03:27:31 [MgmSrvr] INFO     -- Node 3: Started arbitrator node 1 [ticket=4a27000255ecb5d3]
2009-04-28 03:27:35 [MgmSrvr] INFO     -- Node 3: Communication to Node 2 opened

Data node A log:
2009-04-28 03:27:31 [ndbd] INFO     -- Received signal 15. Performing stop.
2009-04-28 03:27:31 [ndbd] INFO     -- Shutdown initiated
2009-04-28 03:27:31 [ndbd] INFO     -- Shutdown completed - exiting
2009-04-28 03:27:31 [ndbd] INFO     -- Angel shutting down
2009-04-28 03:27:31 [ndbd] INFO     -- Node 2: Node shutdown completed. Initiated by signal 15.

Data node B log:
start_resend(0, empty bucket (231387/1 231387/0) -> active
execGCP_NOMORETRANS(231387/1) c_ongoing_take_over_cnt -> seize
completing gcp 231387/1 in execTAKE_OVERTCCONF

SQL node log:
090428  3:27:35 [Note] NDB Binlog: Node: 2, down, Subscriber bitmask 00


4. Insert another record
mysql -vvv -e "
use world;
insert into a values (2);
commit;
select * from a;
"
[root@DEV-Blur-DB-1 ~]# mysql -vvv -e "
> use world;
> insert into a values (2);
> commit;
> select * from a;
> "
--------------
insert into a values (2)
--------------

Query OK, 1 row affected (0.00 sec)

--------------
commit
--------------

Query OK, 0 rows affected (0.00 sec)

--------------
select * from a
--------------

+------+
| a    |
+------+
|    2 |
|    1 |
+------+
2 rows in set (0.00 sec)

Bye
[root@DEV-Blur-DB-1 ~]#


5. Restart data node A
[root@DEV24-DB-1 ~]# ndbd
2009-04-28 03:32:44 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 1
[root@DEV24-DB-1 ~]#

[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=4    @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#

Management node log:
2009-04-28 03:32:44 [MgmSrvr] INFO     -- Mgmt server state: nodeid 2 reserved for ip 192.168.13.136, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000016.
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 1: Node 2 Connected
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 3: Node 2 Connected
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Node 3 Connected
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: CM_REGCONF president = 3, own Node = 2, our dynamic id = 3
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Node 3: API mysql-5.1.32 ndb-7.0.5
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Start phase 1 completed
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 3: Node 2: API mysql-5.1.32 ndb-7.0.5
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Start phase 2 completed (node restart)
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Start phase 3 completed (node restart)
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 2: Start phase 4 completed (node restart)
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Node 3: DICT: locked by node 2 for NodeRestart
2009-04-28 03:32:45 [MgmSrvr] INFO     -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000012.
2009-04-28 03:32:46 [MgmSrvr] INFO     -- Node 2: Receive arbitrator node 1 [ticket=4a27000255ecb5d3]
2009-04-28 03:32:47 [MgmSrvr] INFO     -- Node 2: Starting to restore schema
2009-04-28 03:32:48 [MgmSrvr] INFO     -- Node 2: Restore of schema complete
2009-04-28 03:32:48 [MgmSrvr] INFO     -- Node 2: DICT: activate index 8 done (sys/def/7/PRIMARY)
2009-04-28 03:32:48 [MgmSrvr] INFO     -- Node 2: DICT: activate index 10 done (sys/def/9/PRIMARY)
2009-04-28 03:32:48 [MgmSrvr] INFO     -- Node 2: DICT: activate index 12 done (sys/def/11/PRIMARY)
2009-04-28 03:32:48 [MgmSrvr] INFO     -- Node 2: Node: 2 StartLog: [GCI Keep: 231073 LastCompleted: 231386 NewestRestorable: 231543]
2009-04-28 03:32:50 [MgmSrvr] INFO     -- Node 3: Local checkpoint 150 started. Keep GCI = 231100 oldest restorable GCI = 231100
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Suma: asking node 3 to recreate subscriptions on me
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Suma: node 3 has completed restoring me
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 5 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 6 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 7 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 8 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 9 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Start phase 100 completed (node restart)
2009-04-28 03:32:55 [MgmSrvr] INFO     -- Node 2: Node 1: API mysql-5.1.32 ndb-7.0.5
2009-04-28 03:32:56 [MgmSrvr] INFO     -- Node 2: Communication to Node 4 opened
2009-04-28 03:32:56 [MgmSrvr] INFO     -- Node 2: Node 4 Connected
2009-04-28 03:32:56 [MgmSrvr] INFO     -- Node 2: Node 4: API mysql-5.1.32 ndb-7.0.5
2009-04-28 03:32:56 [MgmSrvr] INFO     -- Node 2: Suma: initiate handover with nodes 0000000000000008 GCI: 231551
2009-04-28 03:32:56 [MgmSrvr] INFO     -- Node 2: Suma: handover from node 3 gci: 231551 buckets: 00000001 (2)
2009-04-28 03:33:01 [MgmSrvr] INFO     -- Node 2: Start phase 101 completed (node restart)
2009-04-28 03:33:01 [MgmSrvr] INFO     -- Node 2: Started (mysql-5.1.32 ndb-7.0.5)
2009-04-28 03:33:01 [MgmSrvr] INFO     -- Node 3: DICT: unlocked by node 2 for NodeRestart

Data node A log:
2009-04-28 03:32:44 [ndbd] INFO     -- Angel pid: 10893 ndb pid: 10894
NDBMT: non-mt
2009-04-28 03:32:44 [ndbd] INFO     -- NDB Cluster -- DB node 2
2009-04-28 03:32:44 [ndbd] INFO     -- mysql-5.1.32 ndb-7.0.5-beta --
2009-04-28 03:32:44 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 84Mb initial: 104Mb
Adding 104Mb to ZONE_LO (1,3327)
2009-04-28 03:32:45 [ndbd] INFO     -- Start initiated (mysql-5.1.32 ndb-7.0.5)
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
Using 1 fragments per node
RWPool::init(c2, 18)
RWPool::init(e2, 16)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 52)
WOPool::init(21, 10)
Found pending trans (0) - committing
commit create 13
restartCreateObj(1) file: 1
restartCreateObj(2) file: 1
restartCreateObj(3) file: 1
restartCreateObj(4) file: 1
restartCreateObj(5) file: 1
restartCreateObj(6) file: 1
restartCreateObj(7) file: 1
restartCreateObj(9) file: 1
restartCreateObj(11) file: 1
restartCreateObj(13) file: 1
tab: 2 frag: 0 replicaP->nextLcp: 1
scanning idx: 0 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(0) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 2 frag: 1 replicaP->nextLcp: 1
scanning idx: 0 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(0) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 3 frag: 0 replicaP->nextLcp: 1
scanning idx: 0 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(0) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 3 frag: 1 replicaP->nextLcp: 1
scanning idx: 0 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(0) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 4 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 4 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 5 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 5 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 6 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 6 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 7 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 7 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 9 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 9 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 11 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 11 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 149 231100
crashed replica: 0(1) replicaLastGci: 231386
Found LCP: 149(1) maxGciStarted: 231100 maxGciCompleted: 231100 restorable: 231386(231386) newestRestorableGCI: 231543
tab: 13 frag: 0 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 0
scanning idx: 0 lcpId: 0
- scanning idx: 2 lcpId: 0
Didnt find any LCP for node: 2 tab: 13 frag: 0
tab: 13 frag: 1 replicaP->nextLcp: 0
scanning idx: 1 lcpId: 0
scanning idx: 0 lcpId: 0
- scanning idx: 2 lcpId: 0
Didnt find any LCP for node: 2 tab: 13 frag: 1
RESTORE table: 2 1039 rows applied
RESTORE table: 2 1013 rows applied
RESTORE table: 3 5 rows applied
RESTORE table: 3 2 rows applied
RESTORE table: 4 5 rows applied
RESTORE table: 4 2 rows applied
RESTORE table: 5 2 rows applied
RESTORE table: 5 1 rows applied
RESTORE table: 6 0 rows applied
RESTORE table: 6 0 rows applied
RESTORE table: 7 2084 rows applied
RESTORE table: 7 1995 rows applied
RESTORE table: 9 117 rows applied
RESTORE table: 9 122 rows applied
RESTORE table: 11 496 rows applied
RESTORE table: 11 488 rows applied
alloc_chunk(111 16) -
2009-04-28 03:32:56 [ndbd] INFO     -- Suma: handover from node 3 gci: 231551 buckets: 00000001 (2)
231551/0 (231550/4294967295) switchover complete bucket 0 state: 1starting

Data node B log:
2009-04-28 03:32:55 [ndbd] INFO     -- granting dict lock to 2
2009-04-28 03:32:55 [ndbd] INFO     -- clearing dict lock for 2
2009-04-28 03:32:56 [ndbd] INFO     -- granting dict lock to 2
prepare to handover bucket: 0
231551/0 (231550/4294967295) switchover complete bucket 0 state: 2handover
2009-04-28 03:33:01 [ndbd] INFO     -- clearing dict lock for 2


6. Query
Both rows are still there after node A rejoined:
mysql -vvv -e "select * from world.a"
[root@DEV-Blur-DB-1 ~]# mysql -vvv -e "select * from world.a"
--------------
select * from world.a
--------------

+------+
| a    |
+------+
|    2 |
|    1 |
+------+
2 rows in set (0.01 sec)

Bye
[root@DEV-Blur-DB-1 ~]#






==================================================
Testing: Adding Two Nodes (One New Node Group)
Reference: Adding MySQL Cluster Data Nodes Online

1. Install the data nodes
Install the software:
rpm -Uvh /u01/software/mysql/cluster/MySQL-Cluster-gpl-storage-7.0.5-0.rhel5.x86_64.rpm
Configure the connection parameters:
cat >/etc/my.cnf<<'EOF'
[mysql_cluster]
ndb-connectstring=qa-pxy-1.s3lab.mot.com   # location of management server
EOF
Create the data directory:
mkdir -p /usr/local/mysql/data


2. The management node's existing configuration file
The configuration file currently in use is as follows:
cat >/var/lib/mysql-cluster/config.ini<<'EOF'
[ndbd default]
NoOfReplicas=2
DataMemory=200M
IndexMemory=200M
DataDir=/usr/local/mysql/data
[ndbd]
Id=1
HostName=dev24-db-1.s3lab.mot.com
[ndbd]
Id=2
HostName=dev24-rpt-1.s3lab.mot.com
[mgm]
Id=10
HostName=qa-pxy-1.s3lab.mot.com
DataDir=/var/lib/mysql-cluster
[api]
Id=20
HostName=dev-blur-db-1.s3lab.mot.com
EOF
In this configuration file, [mgm] is an alias for [ndb_mgmd] and [api] for [mysqld]. Reference: MySQL Cluster Configuration Files

[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#
3. Create a test table
The Country-to-Country self-join below yields roughly 239 × 239 ≈ 57,000 candidate rows, which is why the LIMIT 30000 and LIMIT 20000 inserts can both be satisfied:
USE world;
drop table if exists ips;
CREATE TABLE ips (
    id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    country_code CHAR(2) NOT NULL,
    type CHAR(4) NOT NULL,
    ip_address varchar(15) NOT NULL,
    addresses BIGINT UNSIGNED DEFAULT NULL,
    date BIGINT UNSIGNED DEFAULT NULL
)   ENGINE NDBCLUSTER;
insert into ips (country_code, type, ip_address, addresses, date)
select a.Code2, substring(a.Name,1,4), substring(a.Name,1,15), round(a.GNP), a.IndepYear
from Country a, Country b
order by rand()
limit 30000;
insert into ips (country_code, type, ip_address, addresses, date)
select a.Code2, substring(a.Name,1,4), substring(a.Name,1,15), round(a.GNP), a.IndepYear
from Country a, Country b
order by rand()
limit 20000;
select count(*) from ips;
mysql> USE world;
Database changed
mysql> drop table if exists ips;
Query OK, 0 rows affected (0.54 sec)

mysql> CREATE TABLE ips (
    ->     id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->     country_code CHAR(2) NOT NULL,
    ->     type CHAR(4) NOT NULL,
    ->     ip_address varchar(15) NOT NULL,
    ->     addresses BIGINT UNSIGNED DEFAULT NULL,
    ->     date BIGINT UNSIGNED DEFAULT NULL
    -> )   ENGINE NDBCLUSTER;
Query OK, 0 rows affected (0.69 sec)

mysql> insert into ips (country_code, type, ip_address, addresses, date)
    -> select a.Code2, substring(a.Name,1,4), substring(a.Name,1,15), round(a.GNP), a.IndepYear
    -> from Country a, Country b
    -> order by rand()
    -> limit 30000;
Query OK, 30000 rows affected, 385 warnings (2.25 sec)
Records: 30000  Duplicates: 0  Warnings: 385

mysql> insert into ips (country_code, type, ip_address, addresses, date)
    -> select a.Code2, substring(a.Name,1,4), substring(a.Name,1,15), round(a.GNP), a.IndepYear
    -> from Country a, Country b
    -> order by rand()
    -> limit 20000;
Query OK, 20000 rows affected, 243 warnings (1.36 sec)
Records: 20000  Duplicates: 0  Warnings: 243

mysql> select count(*) from ips;
+----------+
| count(*) |
+----------+
|    50000 |
+----------+
1 row in set (0.00 sec)

mysql>
4. Edit the management node configuration file
Append two data nodes:
[ndbd]
Id=3
HostName=qa-blur-db-1.s3lab.mot.com
[ndbd]
Id=4
HostName=dev-db-1.s3lab.mot.com
cat >>/var/lib/mysql-cluster/config.ini <<EOF
[ndbd]
Id=3
HostName=qa-blur-db-1.s3lab.mot.com
[ndbd]
Id=4
HostName=dev-db-1.s3lab.mot.com
EOF
5. Restart the management node
Stop the management node:
[root@QA-pxy-1 ~]# ndb_mgm -e "10 stop"
Connected to Management Server at: localhost:1186
Node 10 has shutdown.
Disconnecting to allow Management Server to shutdown

[root@QA-pxy-1 ~]#
Restart it, passing --reload so the changed config.ini is picked up:
[root@QA-pxy-1 ~]# ndb_mgmd -f /var/lib/mysql-cluster/config.ini --reload
2009-04-28 08:41:37 [MgmSrvr] INFO     -- NDB Cluster Management Server. mysql-5.1.32 ndb-7.0.5-beta
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Loaded config from '//mysql-cluster/ndb_10_config.bin.1'
[root@QA-pxy-1 ~]#
Log:
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Reading cluster configuration from '/var/lib/mysql-cluster/config.ini'
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Detected change of /var/lib/mysql-cluster/config.ini on disk, will try to set it when all ndb_mgmd(s) started. This is the actual diff:
[ndbd(DB)]
NodeId=3
Node removed

[ndbd(DB)]
NodeId=4
Node removed

[TCP]
NodeId1=1
NodeId2=3
Connection removed

[TCP]
NodeId1=1
NodeId2=4
Connection removed

[TCP]
NodeId1=2
NodeId2=3
Connection removed

[TCP]
NodeId1=2
NodeId2=4
Connection removed

[TCP]
NodeId1=3
NodeId2=4
Connection removed

[TCP]
NodeId1=20
NodeId2=3
Connection removed

[TCP]
NodeId1=20
NodeId2=4
Connection removed

[TCP]
NodeId1=10
NodeId2=3
Connection removed

[TCP]
NodeId1=10
NodeId2=4
Connection removed


2009-04-28 08:41:37 [MgmSrvr] INFO     -- Mgmt server state: nodeid 10 reserved for ip 192.168.12.61, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000000400.
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Node 10: Node 10 Connected
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Id: 10, Command port: *:1186
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Starting configuration change, generation: 1
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Configuration 2 commited
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Config change completed! New generation: 2
2009-04-28 08:41:37 [MgmSrvr] WARNING  -- Failed to convert connection from '192.168.13.141:63669' to transporter
2009-04-28 08:41:37 [MgmSrvr] WARNING  -- Failed to convert connection from '192.168.13.136:24626' to transporter
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Node 10: Node 2 Connected
2009-04-28 08:41:37 [MgmSrvr] INFO     -- Node 10: Node 1 Connected
2009-04-28 08:41:38 [MgmSrvr] INFO     -- Node 2: Prepare arbitrator node 10 [ticket=65970004570bca52]
2009-04-28 08:41:38 [MgmSrvr] INFO     -- Node 1: Started arbitrator node 10 [ticket=65970004570bca52]
Status:
[root@QA-pxy-1 ~]# ndb_mgm -e show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3 (not connected, accepting connect from qa-blur-db-1.s3lab.mot.com)
id=4 (not connected, accepting connect from dev-db-1.s3lab.mot.com)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

[root@QA-pxy-1 ~]#
6. Rolling restart of the existing data nodes
ndb_mgm> 1 restart
ndb_mgm> 2 restart
[root@QA-pxy-1 ~]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3 (not connected, accepting connect from qa-blur-db-1.s3lab.mot.com)
id=4 (not connected, accepting connect from dev-db-1.s3lab.mot.com)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

ndb_mgm> 1 restart
Node 1: Node shutdown initiated
Node 1: Node shutdown completed, restarting, no start.
Node 1 is being restarted

ndb_mgm> Node 1: Start initiated (version 7.0.5)
Node 1: Data usage decreased to 0%(0 32K pages of total 3200)
Node 1: Started (version 7.0.5)

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from qa-blur-db-1.s3lab.mot.com)
id=4 (not connected, accepting connect from dev-db-1.s3lab.mot.com)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

ndb_mgm> 2 restart
Node 2: Node shutdown initiated
Node 2: Node shutdown completed, restarting, no start.
Node 2 is being restarted

ndb_mgm> Node 2: Start initiated (version 7.0.5)
Node 2: Data usage decreased to 0%(0 32K pages of total 3200)
Node 2: Started (version 7.0.5)

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3 (not connected, accepting connect from qa-blur-db-1.s3lab.mot.com)
id=4 (not connected, accepting connect from dev-db-1.s3lab.mot.com)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

ndb_mgm>
Note: after each restart command, always wait for the "Node ... Started (version ..." message to appear before restarting the next node.
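
For a larger cluster the wait can be scripted. A minimal sketch, assuming node IDs 1 and 2 and that the status command reports "started" only once a node is fully up (as in the transcripts above):
# rolling restart, one data node at a time
for id in 1 2; do
    ndb_mgm -e "$id restart"
    sleep 10   # give the shutdown time to begin before polling
    until ndb_mgm -e "$id status" | grep -qi "started"; do
        sleep 5
    done
done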


7. Rolling restart of the SQL node
[root@DEV-Blur-DB-1 ~]# service mysql restart
Shutting down MySQL..                                      [  OK  ]
Starting MySQL..                                           [  OK  ]
[root@DEV-Blur-DB-1 ~]#
8. Start the new data nodes
ndbd --initial
(--initial performs an initial start, creating the node's file system from scratch; it is appropriate here only because these nodes are brand new.)
Start data node C:
[root@QA-Blur-DB-1 ~]# ndbd --initial
2009-04-28 09:12:44 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 2
[root@QA-Blur-DB-1 ~]#
Log:
2009-04-28 09:12:44 [ndbd] INFO     -- Angel pid: 5370 ndb pid: 5371
NDBMT: non-mt
2009-04-28 09:12:44 [ndbd] INFO     -- NDB Cluster -- DB node 3
2009-04-28 09:12:44 [ndbd] INFO     -- mysql-5.1.32 ndb-7.0.5-beta --
2009-04-28 09:12:44 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 104Mb initial: 124Mb
Adding 124Mb to ZONE_LO (1,3967)
2009-04-28 09:12:44 [ndbd] INFO     -- Start initiated (mysql-5.1.32 ndb-7.0.5)
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
Using 1 fragments per node
RWPool::init(c2, 18)
RWPool::init(e2, 16)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 52)
WOPool::init(21, 10)
execSTART_RECREQ chaning srnodes from 000000000000000e to 0000000000000008
Management node log:
2009-04-28 09:12:44 [MgmSrvr] INFO     -- Mgmt server state: nodeid 3 reserved for ip 192.168.18.61, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000100408.
2009-04-28 09:12:44 [MgmSrvr] INFO     -- Node 10: Node 3 Connected
2009-04-28 09:12:44 [MgmSrvr] INFO     -- Mgmt server state: nodeid 3 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000100400.
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 1: Node 3 Connected
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Node 1 Connected
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Node 2 Connected
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: CM_REGCONF president = 1, own Node = 3, our dynamic id = 5
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 1: Node 3: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 2: Node 3 Connected
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 2: Node 3: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Node 1: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Node 2: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Start phase 1 completed
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Start phase 2 completed (initial node restart)
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 1: DICT: locked by node 3 for NodeRestart
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Start phase 3 completed (initial node restart)
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Local redo log file initialization status:
#Total files: 64, Completed: 0
#Total MBytes: 1024, Completed: 0
2009-04-28 09:12:45 [MgmSrvr] INFO     -- Node 3: Receive arbitrator node 10 [ticket=6a2800065712888f]
2009-04-28 09:12:52 [MgmSrvr] INFO     -- Node 3: Local redo log file initialization completed:
#Total files: 64, Completed: 64
#Total MBytes: 1024, Completed: 1024
2009-04-28 09:12:52 [MgmSrvr] INFO     -- Node 3: Start phase 4 completed (initial node restart)
2009-04-28 09:12:54 [MgmSrvr] INFO     -- Node 3: Starting to restore schema
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: Restore of schema complete
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: DICT: activate index 8 done (sys/def/7/PRIMARY)
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: DICT: activate index 10 done (sys/def/9/PRIMARY)
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: DICT: activate index 12 done (sys/def/11/PRIMARY)
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: DICT: activate index 14 done (sys/def/13/PRIMARY)
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 1: Adding node 3 to sysfile, NS_Configured
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 3: Node: 3 StartLog: [GCI Keep: 2955 LastCompleted: 0 NewestRestorable: 3726]
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 1: Local checkpoint blocked waiting for node-restart
2009-04-28 09:12:58 [MgmSrvr] INFO     -- Node 1: Local checkpoint 21 started. Keep GCI = 3001 oldest restorable GCI = 3006
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Suma: asking node 1 to recreate subscriptions on me
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Suma: node 1 has completed restoring me
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 5 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 6 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 7 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 8 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 9 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Start phase 100 completed (initial node restart)
2009-04-28 09:13:03 [MgmSrvr] INFO     -- Node 3: Node 10: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 3: Communication to Node 20 opened
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 3: Node 20 Connected
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 1: DICT: unlocked by node 3 for NodeRestart
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 3: Node 20: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 3: Start phase 101 completed (initial node restart)
2009-04-28 09:13:04 [MgmSrvr] INFO     -- Node 3: Started (mysql-5.1.32 ndb-7.0.5)
Data node A (Master) log:
2009-04-28 09:13:03 [ndbd] INFO     -- granting dict lock to 3
2009-04-28 09:13:03 [ndbd] INFO     -- clearing dict lock for 3
Start data node D:
[root@DEV-DB-1 ~]# ndbd --initial
2009-04-28 09:17:08 [ndbd] INFO     -- Configuration fetched from 'qa-pxy-1.s3lab.mot.com:1186', generation: 2
[root@DEV-DB-1 ~]#
Log:
2009-04-28 09:17:08 [ndbd] INFO     -- Angel pid: 4732 ndb pid: 4733
NDBMT: non-mt
2009-04-28 09:17:08 [ndbd] INFO     -- NDB Cluster -- DB node 4
2009-04-28 09:17:08 [ndbd] INFO     -- mysql-5.1.32 ndb-7.0.5-beta --
2009-04-28 09:17:08 [ndbd] INFO     -- Ndbd_mem_manager::init(1) min: 104Mb initial: 124Mb
Adding 124Mb to ZONE_LO (1,3967)
2009-04-28 09:17:08 [ndbd] INFO     -- Start initiated (mysql-5.1.32 ndb-7.0.5)
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
NDBFS/AsyncFile: Allocating 310392 for In/Deflate buffer
WOPool::init(61, 9)
RWPool::init(22, 13)
RWPool::init(42, 18)
RWPool::init(62, 13)
Using 1 fragments per node
RWPool::init(c2, 18)
RWPool::init(e2, 16)
WOPool::init(41, 8)
RWPool::init(82, 12)
RWPool::init(a2, 52)
WOPool::init(21, 10)
execSTART_RECREQ chaning srnodes from 000000000000001e to 0000000000000010
Management node log:
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Mgmt server state: nodeid 4 reserved for ip 192.168.14.30, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000100410.
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 10: Node 4 Connected
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 4: Start phase 0 completed
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 4: Communication to Node 1 opened
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 4: Communication to Node 2 opened
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 4: Communication to Node 3 opened
2009-04-28 09:17:08 [MgmSrvr] INFO     -- Node 4: Initial start, waiting for 000000000000000e to connect,  nodes [ all: 000000000000001e connected: 0000000000000010 no-wait: 0000000000000000 ]
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Mgmt server state: nodeid 4 freed, m_reserved_nodes 0000000000000000000000000000000000000000000000000000000000100400.
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 1: Node 4 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 3: Node 4 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 1 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 2 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 3 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: CM_REGCONF president = 1, own Node = 4, our dynamic id = 6
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 1: Node 4: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 2: Node 4 Connected
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 2: Node 4: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 3: Node 4: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 1: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 2: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Node 3: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Start phase 1 completed
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Start phase 2 completed (initial node restart)
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 1: DICT: locked by node 4 for NodeRestart
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Start phase 3 completed (initial node restart)
2009-04-28 09:17:09 [MgmSrvr] INFO     -- Node 4: Local redo log file initialization status:
#Total files: 64, Completed: 0
#Total MBytes: 1024, Completed: 0
2009-04-28 09:17:10 [MgmSrvr] INFO     -- Node 4: Receive arbitrator node 10 [ticket=6a2800065712888f]
2009-04-28 09:17:15 [MgmSrvr] INFO     -- Node 4: Local redo log file initialization completed:
#Total files: 64, Completed: 64
#Total MBytes: 1024, Completed: 1024
2009-04-28 09:17:15 [MgmSrvr] INFO     -- Node 4: Start phase 4 completed (initial node restart)
2009-04-28 09:17:17 [MgmSrvr] INFO     -- Node 4: Starting to restore schema
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: Restore of schema complete
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: DICT: activate index 8 done (sys/def/7/PRIMARY)
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: DICT: activate index 10 done (sys/def/9/PRIMARY)
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: DICT: activate index 12 done (sys/def/11/PRIMARY)
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: DICT: activate index 14 done (sys/def/13/PRIMARY)
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 1: Adding node 4 to sysfile, NS_Configured
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 1: Local checkpoint blocked waiting for node-restart
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 4: Node: 4 StartLog: [GCI Keep: 3001 LastCompleted: 0 NewestRestorable: 3855]
2009-04-28 09:17:20 [MgmSrvr] INFO     -- Node 1: Local checkpoint 22 started. Keep GCI = 3727 oldest restorable GCI = 3169
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Suma: asking node 1 to recreate subscriptions on me
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Suma: node 1 has completed restoring me
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 5 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 6 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 7 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 8 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 9 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Start phase 100 completed (initial node restart)
2009-04-28 09:17:25 [MgmSrvr] INFO     -- Node 4: Node 10: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 4: Communication to Node 20 opened
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 4: Node 20 Connected
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 1: DICT: unlocked by node 4 for NodeRestart
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 4: Node 20: API mysql-5.1.32 ndb-7.0.5
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 4: Start phase 101 completed (initial node restart)
2009-04-28 09:17:26 [MgmSrvr] INFO     -- Node 4: Started (mysql-5.1.32 ndb-7.0.5)
Data node A (Master) log:
2009-04-28 09:17:25 [ndbd] INFO     -- granting dict lock to 4
2009-04-28 09:17:25 [ndbd] INFO     -- clearing dict lock for 4
Status:
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3    @192.168.18.61  (mysql-5.1.32 ndb-7.0.5, no nodegroup)
id=4    @192.168.14.30  (mysql-5.1.32 ndb-7.0.5, no nodegroup)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

ndb_mgm>
Right now, the two newly added nodes do not belong to any node group.


9. Create a node group
create nodegroup 3,4
ndb_mgm> create nodegroup 3,4
Nodegroup 1 created
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=1    @192.168.13.136  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0, Master)
id=2    @192.168.13.141  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 0)
id=3    @192.168.18.61  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 1)
id=4    @192.168.14.30  (mysql-5.1.32 ndb-7.0.5, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=10   @192.168.12.61  (mysql-5.1.32 ndb-7.0.5)

[mysqld(API)]   1 node(s)
id=20   @192.168.18.60  (mysql-5.1.32 ndb-7.0.5)

ndb_mgm>
They have been assigned to node group 1.

SQL node log:
handle_change_nodegroup(add, cnt=2,gci=3965/9)
 - NOT FOUND
handle_change_nodegroup(add, cnt=2,gci=3965/9)
 - NOT FOUND
handle_change_nodegroup(add, cnt=2,gci=3965/9)
 - NOT FOUND
handle_change_nodegroup(add, cnt=2,gci=3965/9)
 - NOT FOUND
Data node C log:
3965/9 (3965/8) switchover complete bucket 0 state: 210add 2
3965/9 (3965/8) switchover complete bucket 1 state: 210add 2
alloc_chunk(65 16) -
Data node D log:
3965/9 (3965/8) switchover complete bucket 0 state: 210add 2
3965/9 (3965/8) switchover complete bucket 1 state: 210add 2
alloc_chunk(65 16) -
10. Check the data distribution
After the node group is created, existing data and indexes are not automatically redistributed to the new nodes, which the report command makes visible:
all report memory
ndb_mgm> all report memory

Node 1: Data usage is 6%(216 32K pages of total 3200)
ndb_mgm> Node 1: Index usage is 0%(122 8K pages of total 12832)
Node 2: Data usage is 6%(216 32K pages of total 3200)
Node 2: Index usage is 0%(122 8K pages of total 12832)
Node 3: Data usage is 0%(16 32K pages of total 3200)
Node 3: Index usage is 0%(0 8K pages of total 12832)
Node 4: Data usage is 0%(16 32K pages of total 3200)
Node 4: Index usage is 0%(0 8K pages of total 12832)

ndb_mgm>
Use the ndb_desc command to inspect the table's partitioning:
ndb_desc ips -d world -p
[root@QA-pxy-1 ~]# ndb_desc ips -d world -p
-- ips --
Version: 9
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 6
Number of primary keys: 1
Length of frm data: 339
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 2
TableStatus: Retrieved
-- Attributes --
id Bigint PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
country_code Char(2;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
type Char(4;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
ip_address Varchar(15;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
addresses Bigunsigned NULL AT=FIXED ST=MEMORY
date Bigunsigned NULL AT=FIXED ST=MEMORY

-- Indexes --
PRIMARY KEY(id) - UniqueHashIndex
PRIMARY(id) - OrderedIndex

-- Per partition info --
Partition       Row count       Commit count    Frag fixed memory       Frag varsized memory
0               24925           24925           1507328                 491520
1               25075           25075           1540096                 524288

NDBT_ProgramExit: 0 - OK

[root@QA-pxy-1 ~]#
Note that there are still only 2 partitions.

Note: if ndb_desc fails with errors like these:
[root@QA-pxy-1 ~]# ndb_desc ips -d world -p
Configuration error: Error : Could not alloc node id at localhost port 1186: Connection done from wrong host ip 127.0.0.1.
Unable to connect to management server.

NDBT_ProgramExit: 1 - Failed

[root@QA-pxy-1 ~]#

[root@QA-pxy-1 ~]# ndb_desc ips -d world -p
Configuration error: Error : Could not alloc node id at localhost port 1186: No free node id found for mysqld(API).
Unable to connect to management server.

NDBT_ProgramExit: 1 - Failed

[root@QA-pxy-1 ~]#

[root@QA-pxy-1 ~]# ndb_desc ips -d world -p --ndb-nodeid=5
Configuration error: Error : Could not alloc node id at localhost port 1186: Id 5 already allocated by another node.
Unable to connect to management server.

NDBT_ProgramExit: 1 - Failed

[root@QA-pxy-1 ~]#
The fix is to remove the fixed Id from the [mysqld] section of the config file, so that an ID is assigned automatically, and to add an empty, spare [mysqld] section. The management node must then be restarted for the new configuration to take effect (see the restart sketch after the config below):
[ndbd default]
NoOfReplicas=2
DataMemory=100M
IndexMemory=100M
DataDir=/usr/local/mysql/data
[ndbd]
Id=1
HostName=dev24-db-1.s3lab.mot.com
[ndbd]
Id=2
HostName=dev24-rpt-1.s3lab.mot.com
[ndbd]
Id=3
HostName=qa-blur-db-1.s3lab.mot.com
[ndbd]
Id=4
HostName=dev-db-1.s3lab.mot.com
[ndb_mgmd]
Id=10
HostName=qa-pxy-1.s3lab.mot.com
DataDir=/var/lib/mysql-cluster
[mysqld]
HostName=dev-blur-db-1.s3lab.mot.com
[mysqld]
#spare api for ndb_* to connect to
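A sketch of one way to restart the management node so the edited config.ini takes effect (node 10 is the management node in this cluster; the --reload option tells ndb_mgmd to pick up the changed file instead of its cached configuration):

# stop the management node, then start it again with the new config
ndb_mgm -e "10 stop"
ndb_mgmd -f /var/lib/mysql-cluster/config.ini --reload
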
11. Redistribute the data
Use the command:
ALTER [ONLINE] TABLE ... REORGANIZE PARTITION
mysql> alter online table world.ips reorganize partition;
Query OK, 0 rows affected (11.88 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql>
Log:
2009-04-28 10:34:25 [MgmSrvr] INFO     -- Node 3: reorg-copy table 13 processed 0 rows
2009-04-28 10:34:25 [MgmSrvr] INFO     -- Node 4: reorg-copy table 13 processed 0 rows
2009-04-28 10:34:26 [MgmSrvr] INFO     -- Node 1: Local checkpoint 26 started. Keep GCI = 5626 oldest restorable GCI = 5626
2009-04-28 10:34:28 [MgmSrvr] INFO     -- Node 1: reorg-copy table 13 processed 24925 rows
2009-04-28 10:34:28 [MgmSrvr] INFO     -- Node 2: reorg-copy table 13 processed 25075 rows
2009-04-28 10:34:31 [MgmSrvr] INFO     -- Node 1: Local checkpoint 27 started. Keep GCI = 5876 oldest restorable GCI = 5878
2009-04-28 10:34:35 [MgmSrvr] INFO     -- Node 3: reorg-delete table 13 processed 0 rows
2009-04-28 10:34:35 [MgmSrvr] INFO     -- Node 4: reorg-delete table 13 processed 0 rows
2009-04-28 10:34:36 [MgmSrvr] INFO     -- Node 1: reorg-delete table 13 processed 12503 rows
2009-04-28 10:34:36 [MgmSrvr] INFO     -- Node 2: reorg-delete table 13 processed 12481 rows
Check the status again:
ndb_mgm> all report memory

Node 1: Data usage is 6%(214 32K pages of total 3200)
ndb_mgm> Node 1: Index usage is 0%(90 8K pages of total 12832)
Node 2: Data usage is 6%(214 32K pages of total 3200)
Node 2: Index usage is 0%(90 8K pages of total 12832)
Node 3: Data usage is 2%(92 32K pages of total 3200)
Node 3: Index usage is 0%(47 8K pages of total 12832)
Node 4: Data usage is 2%(92 32K pages of total 3200)
Node 4: Index usage is 0%(47 8K pages of total 12832)

ndb_mgm>
[root@QA-pxy-1 ~]# ndb_desc ips -d world -p
-- ips --
Version: 16777225
Fragment type: 9
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 6
Number of primary keys: 1
Length of frm data: 341
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 4
TableStatus: Retrieved
-- Attributes --
id Bigint PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY AUTO_INCR
country_code Char(2;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
type Char(4;latin1_swedish_ci) NOT NULL AT=FIXED ST=MEMORY
ip_address Varchar(15;latin1_swedish_ci) NOT NULL AT=SHORT_VAR ST=MEMORY
addresses Bigunsigned NULL AT=FIXED ST=MEMORY
date Bigunsigned NULL AT=FIXED ST=MEMORY

-- Indexes --
PRIMARY KEY(id) - UniqueHashIndex
PRIMARY(id) - OrderedIndex

-- Per partition info --
Partition       Row count       Commit count    Frag fixed memory       Frag varsized memory
0               12422           49931           1507328                 491520
1               12594           50037           1540096                 524288
3               12481           12481           753664                  262144
2               12503           12503           753664                  262144

NDBT_ProgramExit: 0 - OK

[root@QA-pxy-1 ~]#
Now there are 4 partitions.

Reorganize all NDB tables the same way. To list all NDB tables, use the following query (a helper that generates the ALTER statements follows the output below):
SELECT TABLE_SCHEMA, TABLE_NAME 
    FROM INFORMATION_SCHEMA.TABLES
    WHERE ENGINE = 'NDBCLUSTER';
mysql> SELECT TABLE_SCHEMA, TABLE_NAME
    ->     FROM INFORMATION_SCHEMA.TABLES
    ->     WHERE ENGINE = 'NDBCLUSTER';
+--------------+------------------+
| TABLE_SCHEMA | TABLE_NAME       |
+--------------+------------------+
| mysql        | ndb_apply_status |
| world        | City             |
| world        | Country          |
| world        | CountryLanguage  |
| world        | ips              |
+--------------+------------------+
5 rows in set (0.20 sec)

mysql>
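A small helper (a sketch, not part of the original procedure) that turns this list into the ALTER statements to run; review the generated statements before executing them:

SELECT CONCAT('ALTER ONLINE TABLE ',
              TABLE_SCHEMA, '.', TABLE_NAME,
              ' REORGANIZE PARTITION;') AS stmt
    FROM INFORMATION_SCHEMA.TABLES
    WHERE ENGINE = 'NDBCLUSTER';
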
12. Reclaim the space used by the old data
Finally, use the OPTIMIZE TABLE command to reclaim space:
mysql> OPTIMIZE TABLE world.ips;
+-----------+----------+----------+----------+
| Table     | Op       | Msg_type | Msg_text |
+-----------+----------+----------+----------+
| world.ips | optimize | status   | OK       |
+-----------+----------+----------+----------+
1 row in set (5.63 sec)

mysql>
It seems to have had no effect? (For NDB tables, OPTIMIZE TABLE reclaims memory only from the variable-width part of deleted rows, which may be why nothing visible changed here.)



==================================================
The data nodes sometimes log the following error; the cause is unclear:
2009-04-28 09:49:36 [ndbd] ERROR    -- Error: : 1006
2009-04-28 09:49:36 [ndbd] ERROR    -- TransporterRegistry.cpp: 1850



==================================================
MySQL Cluster is a high availability database which leverages a shared-nothing data storage architecture. The system consists of multiple nodes which can be distributed across hosts to ensure continuous availability in the event of a data node, hardware or network failure. MySQL Cluster Carrier Grade Edition uses a storage engine, consisting of a set of data nodes to store data, which is accessed through a native C++ API, Java, LDAP or standard SQL interface.

MySQL Cluster consists of three different types of nodes, each providing specialized services within the cluster.

Data Nodes are the main nodes of the cluster, providing the following functionality:
* Data storage and management of both in-memory and disk-based data
* Automatic and user defined partitioning of data, even as new nodes are added to a running cluster
* Synchronous replication of data between data nodes
* Transactions and data retrieval
* Automatic fail over
* Resynchronization after failure

By storing and distributing data in a shared-nothing architecture, i.e. without the use of a shared disk, if a Data Node happens to fail, there will always be at least one additional Data Node storing the same information. This allows for requests and transactions to continue to be satisfied without interruption. Data Nodes can also be added on-line, allowing for unprecedented scalability of data capacity and processing.

Application Nodes are the applications connecting to the database. This can take the form of an application leveraging the high performance NDB API or the use of MySQL Servers which perform the function of SQL interfaces into the data stored within a cluster. Thus, applications can simultaneously access the data in MySQL Cluster using a rich set of interfaces, such as SQL, LDAP and web services. Moreover, additional Application Nodes can be added online.



==================================================
Frequently asked questions

Selected from Chapter 16. MySQL 5.1 FAQ — MySQL Cluster
This FAQ appears to be somewhat dated and not all of it has been kept current; for example, it says that data nodes cannot be added online, whereas Cluster version 7 can do this.
Treat it as reference only.


16.10: How much RAM do I need to use MySQL Cluster? Is it possible to use disk memory at all?

In MySQL 5.1, Cluster is in-memory only. This means that all table data (including indexes) is stored in RAM. Therefore, if your data takes up 1 GB of space and you want to replicate it once in the cluster, you need 2 GB of memory to do so (1 GB per replica). This is in addition to the memory required by the operating system and any applications running on the cluster computers.
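A rough worked example of this rule, plugging in the 1 GB figure and the layout used in this post (NoOfReplicas=2, four data nodes in two node groups), assuming an even distribution of data:

total storage  = data size x NoOfReplicas = 1 GB x 2 = 2 GB
per data node  = 2 GB / 4 nodes           = 512 MB
(plus IndexMemory and OS/application overhead on each host)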


16.12: Can I run MySQL Cluster nodes inside virtual machines (such as those created by VMWare, Parallels, or Xen)?

This is possible but not recommended for a production environment.

We have found that running MySQL Cluster processes inside a virtual machine can give rise to issues with timing and disk subsystems that have a strong negative impact on the operation of the cluster. The behavior of the cluster is often unpredictable in these cases.

If an issue can be reproduced outside the virtual environment, then we may be able to provide assistance. Otherwise, we cannot support it at this time.


16.17: Is MySQL Cluster transaction-safe? What isolation levels are supported?

Yes: For tables created with the NDB storage engine, transactions are supported. Currently, MySQL Cluster supports only the READ COMMITTED transaction isolation level.
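For example, the level can be requested explicitly and then inspected from an SQL node (in 5.1 the session variable is tx_isolation):

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
mysql> SELECT @@tx_isolation;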


16.34: How do I handle MySQL users in a MySQL Cluster having multiple MySQL servers?

MySQL user accounts and privileges are not automatically propagated between different MySQL servers accessing the same MySQL Cluster. Therefore, you must make sure that these are copied between the SQL nodes yourself. You can do this manually, or automate the task with scripts.
Warning

Do not attempt to work around this issue by converting the MySQL system tables to use the NDBCLUSTER storage engine. Only the MyISAM storage engine is supported for these tables.
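A minimal sketch of one scripted approach using mysqldump; the target hostname is hypothetical, standing in for a second SQL node, and credentials are omitted:

# copy the MyISAM grant tables from this cluster's SQL node to another one,
# then reload the privilege cache on the target
mysqldump -h dev-blur-db-1.s3lab.mot.com mysql user db tables_priv columns_priv \
  | mysql -h other-sql-node.s3lab.mot.com mysql
mysql -h other-sql-node.s3lab.mot.com -e 'FLUSH PRIVILEGES;'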





External links:
MySQL Cluster
Chapter 17. MySQL Cluster NDB 6.X/7.X




-fin-
