Delayed replication allows a replication slave to deliberately lag behind the master by at least a specified amount of time. Before executing an event, the slave will first wait, if necessary, until the given time has passed since the event was created on the master. The result is that the slave will reflect the state of the master some time back in the past. This feature is supported since MySQL 5.6 and MariaDB 10.2.3. It can come in handy in case of accidental data deletion, and should be part of your disaster recovery plan. Show The problem when setting up a delayed replication slave is how much delay we should put on. Too short of time and you risk the bad query getting to your delayed slave before you can get to it, thus wasting the point of having the delayed slave. Optionally, you can have your delayed time to be so long that it take hours for your delayed slave to catch up to where the master was at the time of the error. Luckily with Docker, process isolation is its strength. Running multiple MySQL instances is pretty convenient with Docker. It allows us to have multiple delayed slaves within a single physical host to improve our recovery time and save hardware resources. If you think a 15-minute delay is too short, we can have another instance with 1-hour delay or 6-hour for an even older snapshot of our database. In this blog post, we are going to deploy multiple MySQL delayed slaves on one single physical host with Docker, and show some recovery scenarios. The following diagram illustrates our final architecture that we want to build: Our architecture consists of an already deployed 2-node MySQL Replication running on physical servers (blue) and we would like to set up another three MySQL slaves (green) with following behaviour:
Take note that we are going to have 3 copies of the exact same data on the same physical server. Ensure our Docker host has the storage required, so do allocate sufficient disk space beforehand. MySQL Master PreparationFirstly, login to the master server and create the replication user:
Then, create a PITR-compatible backup on the master:
If you are using ClusterControl, you can make a PITR-compatible backup easily. Go to Backups -> Create Backup and pick “Complete PITR-compatible” under the “Dump Type” dropdown: Finally, transfer this backup to the Docker host:
This backup file will be used by the MySQL slave containers during the slave bootstrapping process, as shown in the next section. Delayed Slave DeploymentPrepare our Docker container directories. Create 3 directories (mysql.conf.d, datadir and sql) for every MySQL container that we are going to launch (you can use loop to simplify the commands below):
“mysql.conf.d” directory will store our custom MySQL configuration file and will be mapped into the container under /etc/mysql.conf.d. “datadir” is where we want Docker to store the MySQL data directory, which maps to /var/lib/mysql of the container and “sql” directory stores our SQL files – backup files in .sql or .sql.gz format to stage the slave before replicating and also .sql files to automate the replication configuration and startup. 15-minute Delayed SlavePrepare the MySQL configuration file for our 15-minute delayed slave:
And add the following lines:
** The server-id value we used for this slave is 10015. Next, under /storage/mysql-slave-15m/sql directory, create two SQL files, one to RESET MASTER (1reset_master.sql) and another one to establish the replication link using CHANGE MASTER statement (3setup_slave.sql). Create a text file 1reset_master.sql and add the following line:
Create a text file 3setup_slave.sql and add the following lines:
MASTER_DELAY=900 is equal to 15 minutes (in seconds). Then copy the backup file taken from our master (that has been transferred into our Docker host) to the “sql” directory and renamed it as 2mysqldump_complete.sql.gz:
The final look of our “sql” directory should be something like this:
Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container. Once everything is in place, run the MySQL container for our 15-minute delayed slave: 0** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master. The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171): 1You can then verify the replication status with following statement: 2At this point, our 15-minute delayed slave container is replicating correctly and our architecture is looking something like this: 1-hour Delayed SlavePrepare the MySQL configuration file for our 1-hour delayed slave: 3And add the following lines: 4** The server-id value we used for this slave is 10060. Next, under /storage/mysql-slave-1h/sql directory, create two SQL files, one to RESET MASTER (1reset_master.sql) and another one to establish the replication link using CHANGE MASTER statement (3setup_slave.sql). Create a text file 1reset_master.sql and add the following line:
Create a text file 3setup_slave.sql and add the following lines: 6MASTER_DELAY=3600 is equal to 1 hour (in seconds). Then copy the backup file taken from our master (that has been transferred into our Docker host) to the “sql” directory and renamed it as 2mysqldump_complete.sql.gz: 7The final look of our “sql” directory should be something like this: 8Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container. Once everything is in place, run the MySQL container for our 1-hour delayed slave: 9** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master. The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171): 0You can then verify the replication status with following statement: 1At this point, our 15-minute and 1-hour MySQL delayed slave containers are replicating from the master and our architecture is looking something like this: 6-hour Delayed SlavePrepare the MySQL configuration file for our 6-hour delayed slave:
And add the following lines: 3** The server-id value we used for this slave is 10006. Next, under /storage/mysql-slave-6h/sql directory, create two SQL files, one to RESET MASTER (1reset_master.sql) and another one to establish the replication link using CHANGE MASTER statement (3setup_slave.sql). Create a text file 1reset_master.sql and add the following line:
Create a text file 3setup_slave.sql and add the following lines: 5MASTER_DELAY=21600 is equal to 6 hours (in seconds). Then copy the backup file taken from our master (that has been transferred into our Docker host) to the “sql” directory and renamed it as 2mysqldump_complete.sql.gz: 6The final look of our “sql” directory should be something like this: 7Take note that we prefix the SQL filename with an integer to determine the execution order when Docker initializes the MySQL container. Once everything is in place, run the MySQL container for our 6-hour delayed slave: 8** The MYSQL_ROOT_PASSWORD value must be the same as the MySQL root password on the master. The following lines are what we are looking for to verify if MySQL is running correctly and connected as a slave to our master (192.168.55.171): 9You can then verify the replication status with following statement: 0At this point, our 5 minutes, 1-hour and 6-hour delayed slave containers are replicating correctly and our architecture is looking something like this: Disaster Recovery ScenarioLet’s say a user has accidentally dropped a wrong column on a big table. Consider the following statement was executed on the master: 1If you are lucky enough to realize it immediately, you could use the 15-minute delayed slave to catch up to the moment before the disaster happens and promote it to become master, or export the missing data out and restore it on the master. Firstly, we have to find the binary log position before the disaster happened. Grab the time now() on the master: 2Then, get the active binary log file on the master: 3Using the same date format, extract the information that we want from the binary log, binlog.000004. We estimate the start time to read from the binlog around 20 minutes ago (2018-12-04 14:35:00) and filter the output to show 25 lines before the “drop column” statement: 4In the bottom few lines of the mysqlbinlog output, you should have the erroneous command that was executed at position 19379556. The position that we should restore is one step before this, which is in position 19379491. This is the binlog position where we want our delayed slave to be up to. Then, on the chosen delayed slave, stop the delayed replication slave and start again the slave to a fixed end position that we figured out above: 5Monitor the replication status and wait until Exec_Master_Log_Pos is equal to Until_Log_Pos value. This could take some time. Once caught up, you should see the following: 6Finally verify if the missing data that we were looking for is there (column “status” still exists): 7Then export the table from our slave container and transfer it to the master server: 8Drop the problematic table and restore it back on the master: 9
We have now recovered our table back to its original state before the disastrous event. To summarize, delayed replication can be used for several purposes:
Final ThoughtsWith Docker, running multiple MySQL instances on a same physical host can be done efficiently. You may use Docker orchestration tools like Docker Compose and Swarm to simplify the multi-container deployment as opposed to the steps shown in this blog post. |