Introduction to Maatkit - Parallel Dump
Maatkit is a set of command-line tools for MySQL, developed as open-source software by Percona.
The same company also develops the XtraDB storage engine, the XtraBackup hot-backup tool, and Percona, its enhanced, patched build of MySQL.
The toolkit includes:
- mk-archiver  purge or archive table rows into another table or a file
- mk-audit  analyze MySQL configuration, schema, and operation, and generate a report
- mk-checksum-filter  a filter for mk-table-checksum output
- mk-deadlock-logger  log InnoDB deadlock information
- mk-duplicate-key-checker  find duplicate or redundant foreign keys and indexes
- mk-fifo-split  split a file into multiple parts and write them to a FIFO pipe (is this useful?)
- mk-find  find tables matching given rules, then execute actions on them
- mk-heartbeat  monitor replication delay between databases
- mk-log-player  split and replay slow query logs
- mk-parallel-dump  multi-threaded dump
- mk-parallel-restore  multi-threaded restore
- mk-profile-compact  compact the output of mk-query-profiler
- mk-query-digest  analyze query logs
- mk-query-profiler  query performance profiler
- mk-show-grants  show user privileges
- mk-slave-delay  make a slave lag behind its master by a configurable amount of time
- mk-slave-find  find and display the tree-shaped replication hierarchy of slaves
- mk-slave-move  move a slave within the replication hierarchy (what is this for?)
- mk-slave-prefetch  run SELECT queries on a slave so the data is pre-fetched into memory
- mk-slave-restart  monitor a slave for errors and restart replication
- mk-table-checksum  quickly check whether two tables hold the same data; can be used to verify data consistency between a master and its slaves
- mk-table-sync  find and fix data differences between two tables on different servers
- mk-upgrade  compare the results of running statements against two databases
- mk-visual-explain  display query execution plans as a tree
1. Installation
Download the RPM package from http://maatkit.googlecode.com/ and install it:

yum -y install perl-TermReadKey.x86_64
rpm -Uvh http://maatkit.googlecode.com/files/maatkit-4623-1.noarch.rpm
It depends on the following packages:
# rpm -q --requires maatkit
/usr/bin/env
perl(DBD::mysql) >= 1.0
perl(DBI)
perl(DBI) >= 1.13
perl(Data::Dumper)
perl(Digest::MD5)
perl(English)
perl(Exporter)
perl(File::Basename)
perl(File::Find)
perl(File::Spec)
perl(File::Temp)
perl(Getopt::Long)
perl(IO::File)
perl(List::Util)
perl(POSIX)
perl(Socket)
perl(Term::ReadKey) >= 2.10
perl(Time::HiRes)
perl(Time::Local)
perl(constant)
perl(sigtrap)
perl(strict)
perl(warnings)
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
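A quick post-install sanity check (a minimal sketch; the exact output will vary by system) is to confirm the package is registered and the two parallel tools resolve on the PATH:

# confirm the package is installed and the dump/restore tools are available
rpm -q maatkit
which mk-parallel-dump mk-parallel-restore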
2. Parallel dump with mk-parallel-dump and restore with mk-parallel-restore
For usage, see:
mk-parallel-dump - Dump sets of MySQL tables in parallel.
mk-parallel-restore - Load files into MySQL in parallel.
1) Create a test table
Create a test table with 3 million rows:
mysql --socket=/var/lib/mysql/data_3306/mysql.sock

set autocommit=0;
drop database if exists dbtest;
create database dbtest;
use dbtest;
drop table if exists t1;
create table t1 (
  id int(9) not null auto_increment,
  name varchar(20) not null,
  age int(3) not null,
  notes varchar(100),
  primary key (id),
  index ind_t1_name (name)
);
truncate table t1;
insert into t1 (name, age, notes)
select conv(floor(rand() * 99999999999999), 10, 36), floor(1+rand()*(100-1)), md5(rand())
from information_schema.COLUMNS a
   , information_schema.COLUMNS b
   , information_schema.COLUMNS c
limit 3000000;
commit;

The resulting table is a bit over 300 MB.
mysql> insert into t1 (name, age, notes)
    -> select conv(floor(rand() * 99999999999999), 10, 36), floor(1+rand()*(100-1)), md5(rand())
    -> from information_schema.COLUMNS a
    ->    , information_schema.COLUMNS b
    ->    , information_schema.COLUMNS c
    -> limit 3000000;
Query OK, 3000000 rows affected (3 min 30.61 sec)
Records: 3000000  Duplicates: 0  Warnings: 0

mysql> commit;
Query OK, 0 rows affected (0.07 sec)

mysql> show table status like 't1';
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
| Name | Engine | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation       | Checksum | Create_options | Comment              |
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
| t1   | InnoDB | 10      | Compact    | 3000249 | 76             | 228294656   | 0               | 86654976     | 0         | 6000000        | 2009-09-23 05:26:34 | NULL        | NULL       | utf8_general_ci | NULL     |                | InnoDB free: 4096 kB |
+------+--------+---------+------------+---------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+----------------------+
1 row in set (0.08 sec)

mysql> system ls -l /var/lib/mysql/data_3306/dbtest
total 319820
-rw-rw---- 1 mysql mysql        61 Sep 23 04:17 db.opt
-rw-rw---- 1 mysql mysql      8646 Sep 23 05:26 t1.frm
-rw-rw---- 1 mysql mysql 327155712 Sep 23 05:30 t1.ibd
mysql>
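Note that the Rows value above (3000249) is only an InnoDB estimate; an exact count (assuming the same socket path as above) can be checked with:

# exact row count; should report 3000000
mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "select count(*) from dbtest.t1;"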
2) Dump the table with mk-parallel-dump and mysqldump and compare
Dump the table with mysqldump:
# mkdir -p $HOME/backup
# cd $HOME/backup && rm -rf *
# time mysqldump --socket=/var/lib/mysql/data_3306/mysql.sock --opt dbtest t1 >dbback-dbtest-t1.sql

real    0m11.316s
user    0m2.348s
sys     0m0.472s

# ls -l dbback-dbtest-t1.sql
-rw-r--r-- 1 root root 179090589 Sep 23 05:31 dbback-dbtest-t1.sql
Dump the table with mk-parallel-dump, writing the output files under $HOME/backup/pdump1, without compression and without recording the binlog position:
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump1 --no-gzip --nobin-log-position --tables="dbtest.t1"
default: 1 tables, 1 chunks, 1 successes, 0 failures, 11.40 wall-clock time, 11.31 dump time

real    0m11.608s
user    0m2.176s
sys     0m0.556s

# find pdump1 -ls
7162198      4 drwxr-xr-x   3 root root      4096 Sep 23 05:33 pdump1
7162199      4 drwxr-xr-x   3 root root      4096 Sep 23 05:33 pdump1/default
7162200      4 drwxr-xr-x   2 root root      4096 Sep 23 05:33 pdump1/default/dbtest
7162201 175076 -rw-r--r--   1 root root 179096039 Sep 23 05:33 pdump1/default/dbtest/t1.000000.sql
Dump the table with mk-parallel-dump again, this time splitting the output into one file for roughly every one million rows:
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump2 --no-gzip --nobin-log-position --tables="dbtest.t1" --chunk-size=1000000
default: 1 tables, 4 chunks, 4 successes, 0 failures, 10.36 wall-clock time, 15.92 dump time

real    0m10.509s
user    0m2.580s
sys     0m0.560s

# find pdump2 -ls
7162202     4 drwxr-xr-x   3 root root     4096 Sep 23 05:33 pdump2
7162203     4 drwxr-xr-x   3 root root     4096 Sep 23 05:33 pdump2/default
7162204     4 drwxr-xr-x   2 root root     4096 Sep 23 05:33 pdump2/default/dbtest
7162205     4 -rw-r--r--   1 root root      101 Sep 23 05:33 pdump2/default/dbtest/t1.chunks
7162208 58544 -rw-r--r--   1 root root 59879876 Sep 23 05:33 pdump2/default/dbtest/t1.000001.sql
7162206    16 -rw-r--r--   1 root root    16365 Sep 23 05:33 pdump2/default/dbtest/t1.000003.sql
7162209 58000 -rw-r--r--   1 root root 59324125 Sep 23 05:33 pdump2/default/dbtest/t1.000000.sql
7162207 58544 -rw-r--r--   1 root root 59880064 Sep 23 05:33 pdump2/default/dbtest/t1.000002.sql

# cat pdump2/default/dbtest/t1.chunks
`id` < 1999835
`id` >= 1999835 AND `id` < 3999669
`id` >= 3999669 AND `id` < 5999503
`id` >= 5999503

The .chunks file records how the table was split into chunks.
Dump the table with mk-parallel-dump using 4 threads in parallel (the default is 2 threads if not specified):
# time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock --base-dir=$HOME/backup/pdump3 --no-gzip --nobin-log-position --tables="dbtest.t1" --chunk-size=1000000 --threads=4
default: 1 tables, 4 chunks, 4 successes, 0 failures, 9.37 wall-clock time, 25.29 dump time

real    0m9.529s
user    0m2.572s
sys     0m0.516s

# find pdump3 -ls
7359077     4 drwxr-xr-x   3 root root     4096 Sep 23 05:34 pdump3
7359078     4 drwxr-xr-x   3 root root     4096 Sep 23 05:34 pdump3/default
7359079     4 drwxr-xr-x   2 root root     4096 Sep 23 05:34 pdump3/default/dbtest
7359080     4 -rw-r--r--   1 root root      101 Sep 23 05:34 pdump3/default/dbtest/t1.chunks
7359084 58544 -rw-r--r--   1 root root 59879876 Sep 23 05:34 pdump3/default/dbtest/t1.000001.sql
7359081    16 -rw-r--r--   1 root root    16365 Sep 23 05:34 pdump3/default/dbtest/t1.000003.sql
7359083 58000 -rw-r--r--   1 root root 59324125 Sep 23 05:34 pdump3/default/dbtest/t1.000000.sql
7359082 58544 -rw-r--r--   1 root root 59880064 Sep 23 05:34 pdump3/default/dbtest/t1.000002.sql
The dump speeds are all roughly the same, with no significant difference: this is a single-CPU system dumping only one table, and the chunking and extra threads may even add some overhead.
On a multi-core or multi-CPU system dumping several tables, mk-parallel-dump should be faster; see the sketch below.
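As a rough illustration of that case (a sketch only: db1 and db2 are placeholder database names, and the --databases filter is assumed to be available in this maatkit release), several databases could be dumped with four worker threads like this:

# dump all tables in db1 and db2, four tables/chunks at a time
time mk-parallel-dump --socket=/var/lib/mysql/data_3306/mysql.sock \
    --base-dir=$HOME/backup/pdump_multi --no-gzip --nobin-log-position \
    --databases="db1,db2" --chunk-size=1000000 --threads=4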
3) Compare importing with the mysql client and mk-parallel-restore
Import with the mysql client:
# time mysql --socket=/var/lib/mysql/data_3306/mysql.sock dbtest <dbback-dbtest-t1.sql

real    3m16.760s
user    0m1.672s
sys     0m0.156s
Import with mk-parallel-restore:
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump1
1 tables, 1 files, 1 successes, 0 failures, 199.75 wall-clock time, 199.75 load time

real    3m19.910s
user    0m0.232s
sys     0m0.136s
Import multiple files with mk-parallel-restore:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "drop table dbtest.t1;"
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump2
1 tables, 4 files, 1 successes, 0 failures, 196.55 wall-clock time, 196.54 load time

real    3m16.653s
user    0m0.268s
sys     0m0.148s
Import multiple files with mk-parallel-restore, using 4 threads:
# mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "drop table dbtest.t1;"
# time mk-parallel-restore --socket=/var/lib/mysql/data_3306/mysql.sock $HOME/backup/pdump3 --threads=4
1 tables, 4 files, 1 successes, 0 failures, 194.19 wall-clock time, 194.19 load time

real    3m14.606s
user    0m0.204s
sys     0m0.164s
The restore speeds are also roughly the same.
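One way to confirm each restore brings the data back intact (a sketch assuming the same socket and table as above) is to checksum the table before dropping it and again after the restore, then compare the two values:

# run before dropping the table and again after the restore; the checksums should match
mysql --socket=/var/lib/mysql/data_3306/mysql.sock -e "checksum table dbtest.t1;"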
Overall the results feel rather unremarkable; I'll try other tools later. The tools for replication and for log/query analysis are probably more useful.
External links:
Tools for MySQL - Maatkit makes MySQL easier to manage.
maatkit - A toolkit that provides advanced functionality for MySQL
mysql-parallel-dump test
-fin-