Jump to Navigation

MySQL: Binlogs

You have to use binlogs for three reasons: when this server is a master to a slave, when this server is part of a replication structure using GTIDs or when you want to be able to do point-in-time recoveries.

A slave uses the binlogs of the master for replication. If you are not using GTIDs, the master binlogs are read sequentially and each non-filtered transaction is applied. The position in the master binlogs of the slave is thus very important.

When using GTIDs, basically the same happens. But in this case, each transaction gets a uniquely identified (a Global Transaction ID). Any GTID on the master that is not on the slave and that is not filtered is applied on the slave.

Finally, using the binlogs, a server can be rolled forward from a given point in time to another point in time. For example, when a server crashes, you can read in the last backup and then apply all binlogs starting at the time of backup up to the moment of the crash.

Filtering binlogs

In a master slave relation, you can filter the transaction to replicate on both the master and the slave. Filtering them on the master means that specific transaction are not written to the binlogs. This can be a bad idea if you want to be able to do point in time recovery of the master server. Usually, filtering on the slave side is a much better idea. The only downside is that transactions are sent to the slave that will not be applied after all.

Settings to filter on the slave side include replicate-do-db, replicate-ignore-db, replicate-do-table, replicate-ignore-table, replicate-wild-do-table and replicate-wild-ignore-table. Filtering can thus be done on a database and on a table level. Rule order is as in this list: first a check on database level is done, then (if the database is matched) on table level, with in each case the first match determining what happens.

Using statement based replication, unexpected things may happen using filtering. The database which is tested again is the default database (the last USE statement). This may not be the database which is changed by the statement, resulting in transaction missing, unexpected transactions being applied or replication failing because objects can not be found. Also, if table filtering is used, the whole statement is either applied or not, which can also be unexpected.

Using row based replication, things get a whole lot more predictable, as the database and table tested again is always the one being modified.

Using GTIDs, there is always one effective filter: if a transaction is already registered, it will never be applied again.

Starting replication

When you start replication (whether this is a new slave server or an existing one) the following happens.

If you are not using GTIDs, replication starts from a specific position in the master binlogs. On an existing slave, that is the position where replication stopped last time. On a new server, this is the first still available master binlog. In both cases, you can with a CHANGE MASTER TO MASTER_LOG_FILE=file, MASTER_LOG_POS=pos statement change this.

If you are using GTIDs, the master binlogs are read from the start and all transactions not registered on the slave are replicated. This mode is indicated using the CHANGE MASTER TO MASTER_AUTO_POSITION=1 statement.

Point in time recovery

To be able to do a point in time recovery (also called roll forward), you have to have the binlogs enabled. You should usually not filter the binlogs (filtering on the slave side is OK). Finally, you have to enable log-slave-updates on slave servers, in order to write the replication data to the binlog.

Tags:


Technical_article | by Dr. Radut