An introduction to Maatkit - parallel dump and restore
Maatkit is a set of command-line tools for MySQL, developed as open-source software by Percona.
The same company also develops the XtraDB storage engine, the XtraBackup hot-backup tool, and Percona Server, an enhanced, patched build of MySQL.
The toolkit includes:
- mk-archiver Archive or purge table rows into another table or a file
- mk-audit Analyze MySQL configuration, schema and operations, and generate a report
- mk-checksum-filter A filter for mk-table-checksum output
- mk-deadlock-logger Log InnoDB deadlock information
- mk-duplicate-key-checker Find duplicate or redundant foreign keys and indexes
- mk-fifo-split Split a file into multiple parts and write them to a FIFO pipe (is this actually useful?)
- mk-find Find tables matching given rules, then run an action on them
- mk-heartbeat Monitor replication delay between databases
- mk-log-player Split and replay slow query logs
- mk-parallel-dump Multi-threaded dump
- mk-parallel-restore Multi-threaded restore
- mk-profile-compact Compact the output of mk-query-profiler
- mk-query-digest Analyze query logs
- mk-query-profiler Query performance profiler
- mk-show-grants Show user privileges
- mk-slave-delay Keep a slave a set amount of time behind its master
- mk-slave-find Find and display the tree-shaped hierarchy of slaves
- mk-slave-move Move a slave within the replication hierarchy (what is this for?)
- mk-slave-prefetch Run SELECT queries on a slave so that data is pre-read into memory
- mk-slave-restart Watch a slave for errors and restart replication
- mk-table-checksum Quickly check whether two tables contain the same data; can be used to verify data consistency between a master and its slaves
- mk-table-sync Find and fix data differences between two tables on different servers
- mk-upgrade Compare the results of running statements against two databases
- mk-visual-explain Display an execution plan as a tree
1. Installation
Download the RPM package from http://maatkit.googlecode.com/ and install it:
yum -y install perl-TermReadKey.x86_64
rpm -Uvh http://maatkit.googlecode.com/files/maatkit-4623-1.noarch.rpm
The package depends on the following:
# rpm -q --requires maatkit
/usr/bin/env
perl(DBD::mysql) >= 1.0
perl(DBI)
perl(DBI) >= 1.13
perl(Data::Dumper)
perl(Digest::MD5)
perl(English)
perl(Exporter)
perl(File::Basename)
perl(File::Find)
perl(File::Spec)
perl(File::Temp)
perl(Getopt::Long)
perl(IO::File)
perl(List::Util)
perl(POSIX)
perl(Socket)
perl(Term::ReadKey) >= 2.10
perl(Time::HiRes)
perl(Time::Local)
perl(constant)
perl(sigtrap)
perl(strict)
perl(warnings)
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
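Once the RPM is in place, a quick, hedged check that the tools are installed and on the PATH (the exact help text varies by version):
# which mk-parallel-dump mk-parallel-restore    # both scripts should resolve, typically under /usr/bin
# mk-parallel-dump --help | head                # print the first lines of the built-in help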
2. Parallel dump with mk-parallel-dump and restore with mk-parallel-restore
For usage details see:
mk-parallel-dump - Dump sets of MySQL tables in parallel.
mk-parallel-restore - Load files into MySQL in parallel.
1) Create a test table
Create a test table with 3 million rows:
mysql --socket=/var/lib/mysql/data_3306/mysql.sock
set autocommit=0;
drop database if exists dbtest;
create database dbtest;
use dbtest;
drop table if exists t1;
create table t1 (
id int(9) not null auto_increment,
name varchar(20) not null,
age int(3) not null,
notes varchar(100),
primary key (id),
index ind_t1_name (name)
);
truncate table t1;
insert into t1 (name, age, notes)
select conv(floor(rand() * 99999999999999), 10, 36), floor(1+rand()*(100-1)), md5(rand())
from information_schema.COLUMNS a
, information_schema.COLUMNS b
, information_schema.COLUMNS c
limit 3000000;
commit;
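The INSERT ... SELECT above generates its rows by cross-joining information_schema.COLUMNS with itself three times, and LIMIT then caps the result at 3,000,000 rows; conv(), rand() and md5() just produce throwaway names, ages and notes. A minimal sketch to confirm the three-way cross join is large enough on a given server (column counts differ per installation):
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "select count(*) as cols, pow(count(*), 3) as cross_join_rows from information_schema.COLUMNS;"   # cross_join_rows must be >= 3000000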
The table comes out at a bit over 300 MB:
mysql> insert into t1 (name, age, notes)
-> select conv(floor(rand() * 99999999999999), 10, 36), floor(1+rand()*(100-1)), md5(rand())
-> from information_schema.COLUMNS a
-> , information_schema.COLUMNS b
-> , information_schema.COLUMNS c
-> limit 3000000;
Query OK, 3000000 rows affected (3 min 30.61 sec)
Records: 3000000 Duplicates: 0 Warnings: 0
mysql> commit;
Query OK, 0 rows affected (0.07 sec)
mysql> show table status like 't1';
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
| t1 | InnoDB | 10 | Compact | 3000249 | 76 | 228294656 | 0 | 86654976 | 0 | 6000000 | 2009-09-23 05:26:34 | NULL | NULL | utf8_general_ci | NULL | | InnoDB free: 4096 kB |
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
1 row in set (0.08 sec)
mysql> system ls -l /var/lib/mysql/data_3306/dbtest
total 319820
-rw-rw---- 1 mysql mysql 61 Sep 23 04:17 db.opt
-rw-rw---- 1 mysql mysql 8646 Sep 23 05:26 t1.frm
-rw-rw---- 1 mysql mysql 327155712 Sep 23 05:30 t1.ibd
mysql>
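Note that the Rows figure reported by SHOW TABLE STATUS (3000249 above) is only an InnoDB estimate; the exact count can be checked directly, at the cost of a scan:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "select count(*) from dbtest.t1;"   # exact count, should report 3000000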
2) Compare dumping the table with mysqldump and with mk-parallel-dump
Dump the table with mysqldump:
# mkdir -p $HOME/backup
# cd $HOME/backup && rm -rf *
# time mysqldump --socket=/var/lib/mysql/data_3306/mysql.sock --opt dbtest t1 >dbback-dbtest-t1.sql
real 0m11.316s
user 0m2.348s
sys 0m0.472s
# ls -l dbback-dbtest-t1.sql
-rw-r--r-- 1 root root 179090589 Sep 23 05:31 dbback-dbtest-t1.sql
Dump the table with mk-parallel-dump, writing the files to the directory $HOME/backup/pdump1, without compression and without recording the binlog position:
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump1 --no-gzip --nobin-log-position --tables="dbtest.t1"
default: 1 tables, 1 chunks, 1 successes, 0 failures, 11.40 wall-clock time, 11.31 dump time
real 0m11.608s
user 0m2.176s
sys 0m0.556s
# find pdump1 -ls
7162198 4 drwxr-xr-x 3 root root 4096 Sep 23 05:33 pdump1
7162199 4 drwxr-xr-x 3 root root 4096 Sep 23 05:33 pdump1/default
7162200 4 drwxr-xr-x 2 root root 4096 Sep 23 05:33 pdump1/default/dbtest
7162201 175076 -rw-r--r-- 1 root root 179096039 Sep 23 05:33 pdump1/default/dbtest/t1.000000.sql
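The single-chunk file is almost the same size as the mysqldump output (179096039 vs 179090589 bytes), presumably from small differences in headers and dump options. A rough sanity check, assuming both files are still under $HOME/backup:
# ls -l dbback-dbtest-t1.sql pdump1/default/dbtest/t1.000000.sql                  # compare the two dump sizes side by side
# grep -c "INSERT INTO" dbback-dbtest-t1.sql pdump1/default/dbtest/t1.000000.sql  # rough comparison; rows may be batched into extended INSERTs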
Dump the table with mk-parallel-dump again, this time writing one output file per (approximately) one million rows:
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump2 --no-gzip --nobin-log-position --tables="dbtest.t1" --chunk-size=1000000
default: 1 tables, 4 chunks, 4 successes, 0 failures, 10.36 wall-clock time, 15.92 dump time
real 0m10.509s
user 0m2.580s
sys 0m0.560s
# find pdump2 -ls
7162202 4 drwxr-xr-x 3 root root 4096 Sep 23 05:33 pdump2
7162203 4 drwxr-xr-x 3 root root 4096 Sep 23 05:33 pdump2/default
7162204 4 drwxr-xr-x 2 root root 4096 Sep 23 05:33 pdump2/default/dbtest
7162205 4 -rw-r--r-- 1 root root 101 Sep 23 05:33 pdump2/default/dbtest/t1.chunks
7162208 58544 -rw-r--r-- 1 root root 59879876 Sep 23 05:33 pdump2/default/dbtest/t1.000001.sql
7162206 16 -rw-r--r-- 1 root root 16365 Sep 23 05:33 pdump2/default/dbtest/t1.000003.sql
7162209 58000 -rw-r--r-- 1 root root 59324125 Sep 23 05:33 pdump2/default/dbtest/t1.000000.sql
7162207 58544 -rw-r--r-- 1 root root 59880064 Sep 23 05:33 pdump2/default/dbtest/t1.000002.sql
# cat pdump2/default/dbtest/t1.chunks
`id` < 1999835
`id` >= 1999835 AND `id` < 3999669
`id` >= 3999669 AND `id` < 5999503
`id` >= 5999503
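These WHERE clauses look like the table's id range split into roughly equal intervals, with the number of chunks driven by the estimated row count and --chunk-size. The range the chunker starts from can be inspected directly (a sketch; the exact boundary arithmetic is internal to the tool):
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "select min(id) as min_id, max(id) as max_id, count(*) as row_count from dbtest.t1;"   # the id range the chunk boundaries are carved out of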
The .chunks file records the chunking rules shown above.
Dump the table with mk-parallel-dump again, this time starting 4 threads (the default is 2 if --threads is not given):
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump3 --no-gzip --nobin-log-position --tables="dbtest.t1" --chunk-size=1000000 --threads=4
default: 1 tables, 4 chunks, 4 successes, 0 failures, 9.37 wall-clock time, 25.29 dump time
real 0m9.529s
user 0m2.572s
sys 0m0.516s
# find pdump3 -ls
7359077 4 drwxr-xr-x 3 root root 4096 Sep 23 05:34 pdump3
7359078 4 drwxr-xr-x 3 root root 4096 Sep 23 05:34 pdump3/default
7359079 4 drwxr-xr-x 2 root root 4096 Sep 23 05:34 pdump3/default/dbtest
7359080 4 -rw-r--r-- 1 root root 101 Sep 23 05:34 pdump3/default/dbtest/t1.chunks
7359084 58544 -rw-r--r-- 1 root root 59879876 Sep 23 05:34 pdump3/default/dbtest/t1.000001.sql
7359081 16 -rw-r--r-- 1 root root 16365 Sep 23 05:34 pdump3/default/dbtest/t1.000003.sql
7359083 58000 -rw-r--r-- 1 root root 59324125 Sep 23 05:34 pdump3/default/dbtest/t1.000000.sql
7359082 58544 -rw-r--r-- 1 root root 59880064 Sep 23 05:34 pdump3/default/dbtest/t1.000002.sql
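To confirm that several dump threads really run concurrently, the server's connection list can be watched from a second terminal while the dump is in progress (a rough sketch):
# while true; do mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "show processlist;" | grep -c SELECT; sleep 1; done   # prints the number of concurrent dump SELECTs once per second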
The dump times are all roughly the same, with no significant difference: this is a single-CPU system and only one table is being dumped, so chunking and extra threads may even add a little overhead.
On a multi-core, multi-CPU system dumping several tables, mk-parallel-dump should be noticeably faster.
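As a quick check before picking --threads, the number of CPUs the kernel sees can be read from /proc/cpuinfo; pdump4 below is just a fresh output directory used for this illustration:
# grep -c ^processor /proc/cpuinfo    # number of logical CPUs; 1 on this box
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump4 --no-gzip --nobin-log-position --tables="dbtest.t1" --chunk-size=1000000 --threads=$(grep -c ^processor /proc/cpuinfo)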
3) Compare importing with mysql and with mk-parallel-restore
Import with mysql:
# time mysql --socket=/var/lib/mysql/data_3306/mysql.sock dbtest <dbback-dbtest-t1.sql
real 3m16.760s
user 0m1.672s
sys 0m0.156s
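Most of that time goes into replaying the INSERTs inside InnoDB. A commonly suggested tweak for a plain mysql import is to disable autocommit and the uniqueness/foreign-key checks for the session; whether it helps here is untested, and mysqldump already writes some of these settings into its header, so treat this as a sketch only:
# (echo "SET autocommit=0; SET unique_checks=0; SET foreign_key_checks=0;"; cat dbback-dbtest-t1.sql; echo "COMMIT;") | mysql --socket=/var/lib/mysql/data_3306/mysql.sock dbtest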
Import with mk-parallel-restore:
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump1
1 tables, 1 files, 1 successes, 0 failures, 199.75 wall-clock time, 199.75 load time
real 3m19.910s
user 0m0.232s
sys 0m0.136s
Import multiple files with mk-parallel-restore:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "drop table dbtest.t1;"
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump2
1 tables, 4 files, 1 successes, 0 failures, 196.55 wall-clock time, 196.54 load time
real 3m16.653s
user 0m0.268s
sys 0m0.148s
Import multiple files with mk-parallel-restore, using 4 threads:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "drop table dbtest.t1;"
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump3 --threads=4
1 tables, 4 files, 1 successes, 0 failures, 194.19 wall-clock time, 194.19 load time
real 3m14.606s
user 0m0.204s
sys 0m0.164s
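After each restore it is worth verifying that the table came back intact; a minimal check is the exact row count plus MySQL's built-in table checksum:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "select count(*) from dbtest.t1; checksum table dbtest.t1;"   # expect 3000000 rows; the checksum can be compared with one taken before the drop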
The import speeds are also all about the same.
Overall this feels fairly unremarkable; I will try some of the other tools later. The tools for replication and for log/query analysis are probably more useful.
External links:
Tools for MySQL - Maatkit makes MySQL - easier to manage.
maatkit - A toolkit that provides advanced functionality for MySQL
mysql-parallel-dump test
-fin-


