Locations of visitors to this page

Friday, October 30, 2009

how to resync a broken master-master replication

how to resync a broken master-master replication

[still in draft]

db01:3306 and db03:3306 are setting up with master-master replicaition.

in db03 the slave process is accidentally stopped for quite a long time that caused by a wrong configuration, after resolving the problem, dual master replication seems to start working, but unfortunately binary logs were delete older than x days, many tables has got out of date or even lost in db03, recreating the slave db is not acceptable, we have to find a more efficient way to make thier data resync again.


1. stop replication from db03 to db01
login db01
show slave status\G
stop slave;
show slave status\G


2. dump sql script for recreating table structure later
login db01
DB_LIST=$(mysql -u root --batch --skip-column-names -e 'show databases;'|sed '/^information_schema$\|^mysql$\|^test$/d'|tr '\n' ' ')
mysqldump -u root --no-data --skip-add-drop-table --databases $DB_LIST >$HOME/db01-3306-db.sql

login db03
DB_LIST=$(mysql -u root --batch --skip-column-names -e 'show databases;'|sed '/^information_schema$\|^mysql$\|^test$/d'|tr '\n' ' ')
mysqldump -u root --no-data --skip-add-drop-table --databases $DB_LIST >$HOME/db03-3306-db.sql


3. recreate lost tables
login db03
scp db01:db01-3306-db.sql $HOME/
mysql -u root --force <$HOME/db01-3306-db.sql

login db01
scp db03:db03-3306-db.sql $HOME/
mysql -u root --force <$HOME/db03-3306-db.sql


4. solve slave sql issues in db03, if any
login db03
show slave status\G
set global sql_slave_skip_counter=1;
start slave;
show slave status\G

now db01 and db03 are actually working in master-slave replication.


5. check data consistency between db01 and db03
run mk-table-checksum against db01, it will execute sql queries to calculate crc checksum of tables in db01, then in db02 the same queries will run too, and calculate checksum in db02, finally we will get checksums of both databases.

mk-table-checksum --ignore-databases=mysql,test --replicate mysql.checksum --empty-replicate-table --create-replicate-table h=db01,P=3306,u=webuser


6. sync table records from db01 to db03
run mk-table-sync with dry-run or print option, see what will happen in db03, it will not actually take any actions.

mk-table-sync --dry-run --ignore-databases=mysql,test --sync-to-master --replicate=mysql.checksum h=db03,P=3306,u=webuser
mk-table-sync --print --ignore-databases=mysql,test --sync-to-master --replicate=mysql.checksum h=db03,P=3306,u=webuser

actually sync table data
mk-table-sync --execute --ignore-databases=mysql,test --sync-to-master --replicate=mysql.checksum h=db03,P=3306,u=webuser


7. known issue
"Can't make changes on the master: no unique index exists at /usr/bin/mk-table-sync line 3775. while doing dbtest.t"
solution: create unique index on the table
it is not a perfect solution, because there may be no suitable columns for defining an unique index, or there may be duplicate records in the table.


8. check to see if all data in both databases are identical
mk-table-checksum --ignore-databases=mysql,test -P 3306 -u webuser db01 db03|mk-checksum-filter


9. start replication from db03 to db01
login db01
show slave status\G
start slave;
show slave status\G


10. external links:

mk-table-checksum
mk-table-sync
How to sync tables in master-master MySQL replication
MySQL Master-Master replication table sync
Maatkit Options for Restoring a Slave or Master
Tip of the Day — Checking Your Tables — Part I
Sync up a MySQL slave with mk-table-sync
mk-table-sync and small tables



-fin-

No comments:

Website Analytics

Followers