|    Home    |    Meetings    |    Wiki    |    Projects    |    Discuss    |    Reviews    |    Members    |   

Title: High Performance MySQL
Authors: Jeremy Zawodny & Derek Balling
Publisher: O'Reilly
Pages: 294 pages
Reviewer: Steven Lembark
Synopsis: Book would be good reading for anyone trying to manage a non-trivial LAMP site.
Table of Contents
MySQL is far less painful than most databases to manage and use. The good thing about this is that you can use it. The bad thing is that many people get rather far into using it well before they really understand how to use it properly -- by which time it's too late. There are any number of good introductory books available on MySQL which describe daily operations and SQL. The problem with introductions is that they do not describe how things really work, why you'd care, or what to do about it. This is especially true of performance issues, which usually involve a combination of internals, usage, and research to manage.

The authors of High Performance MySQL ask for "a little more Patience and time commitment than the average introductory computer book." They aren't kidding. Fortunately this is not another introductory book and really does describe enough internals, usage, and examples to help manage performance. They also describe it all with a readable style that avoided using up my patience on their text.

The chapters are broken down into five main sections, from basics through benchmarking and indexes, tuning, scalability, and replication and load-balancing. Most people dealing with day-to-day issues can probably get what they need from the first three (Chapters 1-6) -- load balancing, high-availability start to get esoteric for many sites with single servers.

The last two chapters on backups and security might seem out of place in a book on "performance". The authors do a good job of pointing out, however, that a down or corrupt database is not performing well, and that backups or tighter security can affect performance in themselves.

The Basics chapter is quite short and covers some minimal topics for configuring or viewing the database status. If this chapter doesn't fall well into the "obvious" category it will probably be worth taking time to read a real introduction before tackling the rest of this book.

Storage Engines is a nice combination of theory and practice, with examples specific-enough be useful. One of MySQL's strengths is handling multiple underlying data storage engines. Choosing between them is a source of confusion, however, since many users don't know enough about the choices to make good ones. This chapter does a good job clearing the mist by describing not only the engines themselves but baseline requirements for an "ACID" database, and isolation levels. Short examples of various uses include stock quotes, log summary, order processing, and bulletin boards. The examples are specific enough to illustrate differences in the storage engines without bogging down in detail.

Chapter three covers benchmarking, including summaries of three available tools for both benchmarking and stress- testing database configurations.

The chapters on indexes, query performance, and server performance will be useful even to experienced DBA's. The information includes index types and their foibles, how indexes are used in queries, cache management, the optimizer, identifying slow queries, hints, and operating system/server issues. Examples on how MySQL uses indexes, where they help/hurt performance, and workarounds will be useful to anyone trying to design a scalable MySQL database.

Chapters seven and eight describe replication and load balancing. They are well-written, but won't apply to directly managers on a with a single computer. They will be worth a skim for managers of growing sites, however, since there are good descriptions of when replication can help.

Backup and recovery is worth reading by anyone who runs a MySQL site. Even small(ish) sites can benefit from good backups -- or ones that don't interfere with normal operations. Various strategies for cold and hot backups are compared, along with decent descriptions of how to restore the databases.

Security is the last main chapter in the book and is more about operational issues than the others. Aside from some affects of encryption, this chapter is about keeping the database up and clean rather than making it run faster. There is a good description of Host Matching in the privilege system, the tables involved with security, and how to manage it. If nothing else, reading this will avoid sites granting all privileges by default in order to avoid figuring out which ones are necessary. The tradeoffs between user-level encryption and other options are also discussed.

The appendixes include the status commands, mytop, and a php-based administration tool. The analysis of MySQL and Innodb show statistics is helpful -- largely because the latter provides a huge amount of information and lacks a 'verbose' option to leave any of it off. In phpMyAdmin, the practical examples start to drift back into the usual "Introduction to..." level and probably could have had more on the performance data that phpMyAdmin includes.

One thing I'd have liked, given MySQL's common use, is any discussion of the various interfaces available to it and how well they perform in different situations. C, C++, DBI, ODBC, and JDBC interfaces are readily available for MySQL and -- from what I've seen -- vary enormously in their performance on the same database. Admittedly this falls more into a "programming" book, but many (most?) MySQL DBA's are managing LAMP systems and at least some discussion of the their language needs would have been helpful.

Overall, the book was detailed enough to be useful, readable enough to finish, contained examples that helped me at work, and was worth the patience to finish.