Use persistent connections to the database to avoid
connection overhead. If you cannot use persistent
connections and you are initiating many new connections to
the database, you may want to change the value of the
thread_cache_size variable. See
Section 7.5.2, “Tuning Server Parameters”.
Always check whether all your queries really use the indexes
that you have created in the tables. In MySQL, you can do
this with the
EXPLAIN statement. See
Section 7.2.1, “Optimizing Queries with
Try to avoid complex
SELECT queries on
MyISAM tables that are updated
frequently, to avoid problems with table locking that occur
due to contention between readers and writers.
MyISAM tables that have no deleted
rows in the middle, you can insert rows at the end at the
same time that another query is reading from the table. If
it is important to be able to do this, you should consider
using the table in ways that avoid deleting rows. Another
possibility is to run
OPTIMIZE TABLE to
defragment the table after you have deleted a lot of rows
from it. See Section 14.1, “The
MyISAM Storage Engine”.
To fix any compression issues that may have occurred with
ARCHIVE tables, you can use
OPTIMIZE TABLE. See
Section 14.8, “The
ARCHIVE Storage Engine”.
ALTER TABLE ... ORDER BY
usually retrieve rows in
using this option after extensive changes to the table, you
may be able to get higher performance.
In some cases, it may make sense to introduce a column that
is “hashed” based on information from other
columns. If this column is short and reasonably unique, it
may be much faster than a “wide” index on many
columns. In MySQL, it is very easy to use this extra column:
SELECT * FROM
MyISAM tables that change frequently,
you should try to avoid all variable-length columns
TEXT). The table uses dynamic row format
if it includes even a single variable-length column. See
Chapter 14, Storage Engines and Table Types.
It is normally not useful to split a table into different
tables just because the rows become large. In accessing a
row, the biggest performance hit is the disk seek needed to
find the first byte of the row. After finding the data, most
modern disks can read the entire row fast enough for most
applications. The only cases where splitting up a table
makes an appreciable difference is if it is a
MyISAM table using dynamic row format
that you can change to a fixed row size, or if you very
often need to scan the table but do not need most of the
columns. See Chapter 14, Storage Engines and Table Types.
If you often need to calculate results such as counts based
on information from a lot of rows, it may be preferable to
introduce a new table and update the counter in real time.
An update of the following form is very fast:
This is very important when you use MySQL storage engines
MyISAM that has only table-level
locking (multiple readers with single writers). This also
gives better performance with most database systems, because
the row locking manager in this case has less to do.
If you need to collect statistics from large log tables, use
summary tables instead of scanning the entire log table.
Maintaining the summaries should be much faster than trying
to calculate statistics “live.” Regenerating
new summary tables from the logs when things change
(depending on business decisions) is faster than changing
the running application.
If possible, you should classify reports as
“live” or as “statistical,” where
data needed for statistical reports is created only from
summary tables that are generated periodically from the live
Take advantage of the fact that columns have default values.
Insert values explicitly only when the value to be inserted
differs from the default. This reduces the parsing that
MySQL must do and improves the insert speed.
In some cases, it is convenient to pack and store data into
BLOB column. In this case, you must
provide code in your application to pack and unpack
information, but this may save a lot of accesses at some
stage. This is practical when you have data that does not
conform well to a rows-and-columns table structure.
Normally, you should try to keep all data non-redundant
(observing what is referred to in database theory as
third normal form). However, there
may be situations in which it can be advantageous to
duplicate information or create summary tables to gain more
Stored routines or UDFs (user-defined functions) may be a
good way to gain performance for some tasks. See
Chapter 19, Stored Procedures and Functions, and
Section 27.3, “Adding New Functions to MySQL”, for more information.
You can always gain something by caching queries or answers
in your application and then performing many inserts or
updates together. If your database system supports table
locks (as do MySQL and Oracle), this should help to ensure
that the index cache is only flushed once after all updates.
You can also take advantage of MySQL's query cache to
achieve similar results; see Section 5.13, “The MySQL Query Cache”.
INSERT DELAYED when you do not need
to know when your data is written. This reduces the overall
insertion impact because many rows can be written with a
single disk write.
INSERT LOW_PRIORITY when you want to
SELECT statements higher priority
than your inserts.
SELECT HIGH_PRIORITY to get
retrievals that jump the queue. That is, the
SELECT is executed even if there is
another client waiting to do a write.
INSERT statements to
store many rows with one SQL statement. Many SQL servers
support this, including MySQL.
LOAD DATA INFILE to load large
amounts of data. This is faster than using
AUTO_INCREMENT columns to generate
OPTIMIZE TABLE once in a while to
avoid fragmentation with dynamic-format
MyISAM tables. See
Section 14.1.3, “
MyISAM Table Storage Formats”.
MEMORY tables when possible to get
more speed. See Section 14.4, “The
HEAP) Storage Engine”.
MEMORY tables are useful for non-critical
data that is accessed often, such as information about the
last displayed banner for users who don't have cookies
enabled in their Web browser. User sessions are another
alternative available in many Web application environments
for handling volatile state data.
With Web servers, images and other binary assets should
normally be stored as files. That is, store only a reference
to the file rather than the file itself in the database.
Most Web servers are better at caching files than database
contents, so using files is generally faster.
Columns with identical information in different tables
should be declared to have identical data types so that
joins based on the corresponding columns will be faster.
Try to keep column names simple. For example, in a table
customer, use a column name of
name instead of
customer_name. To make your names
portable to other SQL servers, you should keep them shorter
than 18 characters.
If you need really high speed, you should take a look at the
low-level interfaces for data storage that the different SQL
servers support. For example, by accessing the MySQL
MyISAM storage engine directly, you could
get a speed increase of two to five times compared to using
the SQL interface. To be able to do this, the data must be
on the same server as the application, and usually it should
only be accessed by one process (because external file
locking is really slow). One could eliminate these problems
by introducing low-level
in the MySQL server (this could be one easy way to get more
performance if needed). By carefully designing the database
interface, it should be quite easy to support this type of
If you are using numerical data, it is faster in many cases
to access information from a database (using a live
connection) than to access a text file. Information in the
database is likely to be stored in a more compact format
than in the text file, so accessing it involves fewer disk
accesses. You also save code in your application because you
need not parse your text files to find line and column
Replication can provide a performance benefit for some
operations. You can distribute client retrievals among
replication servers to split up the load. To avoid slowing
down the master while making backups, you can make backups
using a slave server. See Chapter 6, Replication.
MyISAM table with the
DELAY_KEY_WRITE=1 table option makes
index updates faster because they are not flushed to disk
until the table is closed. The downside is that if something
kills the server while such a table is open, you should
ensure that the table is okay by running the server with the
--myisam-recover option, or by running
myisamchk before restarting the server.
(However, even in this case, you should not lose anything by
DELAY_KEY_WRITE, because the key
information can always be generated from the data rows.)