Ok, so you have your server configured at optimal performance, a great caching strategy, and your code has been refactored into its best working order. Now what? Time to take a dive into the slowest part of the web server chain: the database.
Breaking up our tables to reduce and eliminate data redundancy not only helps reduce the size of data storage, but can also increase performance. As well, it can also decrease performance in many situations, and at times developers have had to denormalize in order recover better read times. Your mileage may vary.
Normalization is typically broken into 5 categories. The Wikipedia links do a decent job of explaining each one:
- 1NF (First normal form): No repeating groups of data
- 2NF (Second normal form): 1NF, plus all of it's non-key attributes are fully dependent on its
- 3NF (Third normal form): 2NF, plus all of its attributes are directly dependent on the
- 4NF (Fourth Normal Form): 3NF, plus it must not contain more than one multivalued dependency.
- 5NF (Fifth Normal Form): A table is 5NF if and only if every non-trivial join dependency in it is implied by the candidate keys.
This seems like a no-brainer, but proper indexing is pretty much rule #1 for optimizing queries. Focus on rows that match conditions on a
WHERE clause. Don't create an unnecessary index, as this wastes space and time for MySQL in determining which index to use, as well as time costs with inserts, updates and deletes.
INDEX mean the same thing in MySQL.
UNIQUE are similar in nature:
PRIMARYis a column or a set of columns that can identify a
ROW. It is a special
UNIQUEkey. You can only have one of these per table.
- There can be many
UNIQUEindexes. An example would be a mobile phone number broken up over several columns, such as +
Joins vs Sub Selects
For those of us who used MySQL before the
OUTER JOIN existed, there were subqueries to save the day. According to MySQL's documentation, the
LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better.
While there are more than two storage engines with MySQL and now MariaDB, the two most commonly used ones are MyISAM and InnoDB. A somewhat dated review by Tag1 Consulting titled "MySQL Engines: MyISAM vs. InnoDB" explains a couple of the key issues most DBAs consider important (such as reliability).
MariaDB provides XtraDB, a performance enhanced fork of InnoDB, as well as many other storage engines.
BENCHMARK() function can be used on its client to repeatedly execute an expression a set number of times to measure elapsed time on the client end.
mysqlslap is a diagnostic program used to emulate client load.
For web-based services, ab - Apache HTTP server benchmarking tool can be used to measure database performance that is included in overall web server load.
EXPLAIN EXTENDED statements can be used in conjunction with a
SELECT query to get information from the optimizer about how tables are joined, in what order, etc.
ANALYZE TABLE should be run when there's a question as to whether specific indexes are being properly used.
We can set up server variables to profile queries without indexes, slow queries, long query times and more. Cloud hosting provider DigitalOcean has an article explaining how to set the MySQL query profiler up.
If you are doing full table scans and are managing tables that are larger than 100GB in size, partitioning tables is highly recommended from MySQL.