Coping With MySQL Slave Lag

DISCLAIMER: I do sysadmin work out of necessity, not by choice. Your mileage with these suggestions may vary. As always, exercise caution when making changes to your application’s architecture.

Background

When scaling your MySQL-based web application, setting up multiple databases in a master-slave configuration can be a great way to distribute load. The procedure is well documented, and fairly straightforward.

  • Any queries that change data (INSERT, UPDATE, ALTER, etc) are performed on the master
  • These changes are replicated the slave server(s)
  • Read queries can be performed on both the master and slave servers

The real advantage here is when a write query on the master locks a table, data from that table can still be read from a slave. WIN!

So once you have your master-slave setup running, you may notice that the slave servers periodically fall behind the master server. Usually, the lag is just a few seconds, but under certain circumstances, slave servers can lag dangerously behind (my personal best is just under one hour). Why? On the master server, multiple queries can be executed in parallel, but when the slave server reads queries from the binary log, they must be executed one at a time. While this may seem like a huge waste, it’s really the key to the replication sceme as it ensures that data on the slaves will exactly match that on the server once all of the queries have been executed.

Preventing Minimizing Lag & Being Smart About Laggy Data

  1. Optimize your queries - Because the slaves can’t replicate queries in parallel, this is your primary target. Break up large INSERT and UPDATE queries into several queries that deal with smaller data sets. And as an added bonus, optimized queries will make your entire application faster! </obvious>
  2. Beef up your slave servers - Conventional logic may suggest that your master server should be your most powerful and that your slave servers can rely on less stellar hardware. However, your master will always be able to process queries more efficiently than the slaves. Once again, it’s because slaves execute each query sequentially. (Notice a pattern?) Chances are, disk IO is going to be your bottleneck. Consider investing in faster hard disks in a RAID configuration.
  3. Give lagging slaves a break - Throwing queries at a lagging slave is only going to make matters worse. Not only will it be less efficient when churning through the binary log, but you’re also much more likely to be working with outdated data set. Incorporate logic into your application to distribute load to your slaves intelligently based on their individual lag (don’t forget, you can always run SELECT queries on the master).
  4. Memcache your writes - When your slaves are lagging, you run the risk of accessing old data. A great way to minimize this risk is to cache all of your most recently written data in some kind of caching layer, I like Memcached. This can be tricky to do depending on how your application interfaces with your database, but if table rows are discretely representated inside your application (IE - rows translate to objects), and you can isolate changes to these individual rows, caching that data can be a great way to prevent inconsistency and improve performance.

Additional Resources

Here are some outstanding resources with even more suggestions and detailed instructions on how to combat a stressfully high Seconds_Behind_Master.

1 month ago | Tags: mysql sysadmin databases programming

Comments