High Performance MySQL [Electronic resources] نسخه متنی

4.3 Indexes and Table Types

Now that we have discussed the
common index types, terminology, and uses in relatively generic terms
so far, lets look at the indexes implemented in
each of MySQLs storage engines. Each engine
implements a subset of the three index types weve
looked at. They also provide different optimizations that you should
be aware of.

4.3.1 MyISAM Tables

MySQLs default table
type provides B-tree indexes, and as of Version 4.1.0, it provides
R-tree indexes for spatial data. In addition to the standard benefits
that come with a good B-tree implementation, MyISAM adds two other
important but relatively unknown features prefix compression and
packed keys.

Prefix
compression is used to factor out common prefixes in string keys. In
a table that stores URLs, it would be a waste of space for MySQL to
store the "http://" in every node
of the B-tree. Because it is common to large number of the keys, it
will compress the common prefix so that it takes significantly less
space.

Packed keys are best thought of as prefix compression for integer
keys. Because integer keys are stored with their high bytes first,
its common for a large group of keys to share a
common prefix because the highest bits of the number change far less
often. To enable packed keys, simply append:

PACKED_KEY = 1

to the CREATE TABLE statement.

MySQL stores the indexes for a table in the tables
.MYI file.

4.3.1.1 Delayed key writes

One performance-enhancing feature of
MyISAM tables is the ability to delay the writing of index data to
disk. Normally, MySQL will flush modified key blocks to disk
immediately after making changes to them, but you can override this
behavior on a per-table basis or globally. Doing so provides a
significant performance boost during heavy INSERT,
UPDATE, and DELETE activity.

MySQLs delay_key_write tristate
setting controls this behavior. The default, ON,
means that MySQL will honor the DELAY_KEY_WRITE
option in CREATE TABLE. Setting
it to OFF means that MySQL will never delay key
writes. And setting it to ALL tells MySQL to delay
key writes on all MyISAM tables regardless of the
DELAY_KEY_WRITE used when the table was created.

The downside of delayed key writes is that the indexes may be out of
sync with the data if MySQL crashes and has unwritten data in its key
buffer. A REPAIR TABLE, which
rebuilds all indexes and may consume a lot of time, is necessary to
correct the problem.

4.3.2 Heap Tables

MySQLs only in-memory
table type was originally built with support just for hash indexes.
As of Version 4.1.0, however, you may choose between B-tree and hash
indexes in Heap tables. The default is still to use a hash index, but
specifying B-tree is simple:

mysql> create table heap_test (
-> name varchar(50) not null,
-> index using btree (name)
-> ) type = HEAP;
Query OK, 0 rows affected (0.00 sec)

To verify that the index was created properly, use the SHOW
KEYS command:

mysql> show keys from heap_test \G
*************************** 1. row ***************************
Table: heap_test
Non_unique: 1
Key_name: name
Seq_in_index: 1
Column_name: name
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null: 
Index_type: BTREE
Comment: 
1 row in set (0.00 sec)

By combining the flexibility of B-tree indexes and the raw speed of
an in-memory table, query performance of the temp tables is hard to
beat. Of course, if all you need are fast single-key lookups, the
default hash indexes in Heap tables will serve you well. They are
lightning fast and very space efficient.

The index data for Heap tables is always stored in memoryjust
like the data.

4.3.3 BDB Tables

MySQLs
Berkeley DB
(BDB) tables provide only B-tree indexes. This may come as a surprise
to long-time BDB users who may be familiar with its underlying
hash-based indexes. The indexes are stored in the same file as the
data itself.

BDBs indexes, like those in MyISAM, also provide
prefix compression. Like InnoDB, BDB also uses clustered indexes, and
BDB tables require a primary key. If you dont
supply one, MySQL creates a hidden primary key it uses internally for
locating rows. The requirement exists because BDB always uses the
primary key to locate rows. Index entries always refer to rows using
the primary key rather than the records physical
location. This means that record lookups on secondary indexes are
slightly slower then primary-key lookups.

4.3.4 InnoDB Tables

InnoDB tables provide B-tree indexes.
The indexes provide no packing or prefix compression. In addition,
InnoDB also requires a primary key for each table. As with BDB,
though, if you dont provide a primary key, MySQL
will supply a 64-bit value for you.

The indexes are stored in the InnoDB tablespace, just like the data
and data dictionary (table definitions, etc.). Furthermore, InnoDB
uses clustered indexes. That is, the primary keys
value directly affects the physical location of the row as well as
its corresponding index node. Because of this, lookups based on
primary key in InnoDB are very fast. Once the index node is found,
the relevant records are likely to already be cached in
InnoDBs buffer pool.

4.3.5 Full-Text Indexes

A full-text index is a special type of
index that can quickly retrieve the locations of every distinct word
in a field. MySQLs provides full-text indexing
support in MyISAM tables. Full-text indexes are built against one or
more text fields (VARCHAR,
TEXT, etc.) in a table.

The full-text index is also stored in a tables
.MYI file. It is implemented by creating a
normal two-part MyISAM B-tree index in which the first field is a
VARCHAR, and the second is a
FLOAT. The first field contains the indexed word,
and the FLOAT is its local weight in the row.

Because they generally contain one record for each word in each
indexed field, full-text indexes can get large rather quickly.
Luckily, MySQLs B-tree indexes are quite efficient,
so space consumed by full-text is well worth the performance boost.

Its not uncommon for a query like:

select * from articles where body = "%database%"

to run thousands of times faster when a full-text index is added and
the query is re-written as:

select * from articles (body) match against (database)

As with all index types, its a matter of trading
space for speed.

4.3.6 Index Limitations

There
are many times when MySQL simply cant use an index
to satisfy a query. To help you recognize these limitations (and
hopefully avoid them), lets look at the four main
impediments to using an index.

4.3.6.1 Wildcard matches

A query to locate all records that
contain the word "buffy":

select * from pages where page_text like "%buffy%"

is bound to be slow. It requires MySQL to scan every row in the
table. And it wont even find all occurrences,
because "buffy" may be followed by
some form of punctuation. The solution, of course, is to build a
full-text index on the page_text field and query
using MySQLs MATCH
AGAINST syntax.

When youre dealing with partial words, however,
things degenerate quickly. Imagine trying to find the phone number
for everyone whose last name contains the string
"son", such as Johnson, Ansona, or
Bronson. That query would look like this:

select phone_number from phone_book where last_name like "%son%"

That seems suspiciously similar to the
"buffy" example, and it is. Because
you are performing a wildcard search on the field, MySQL will need to
read every row, but switching to a full-text index
wont help. Full-text indexes deal with complete
words, so theyre of no help in this situation.

If thats surprising, consider how
youd attempt to locate all those names in a normal
phone book. Can you think of an efficient approach?
Theres really no simple change that can be made to
the printed phone book that will facilitate this type of query.

4.3.6.2 Regular expressions

Using a regular expression has
similar problems. Imagine trying to find all last names that end with
either "ith," such as Smith, or
"son" as in Johnson. As any Perl
hacker would tell you, thats easy. Build a regular
expression that looks something like (son|ith)$.

Translating that into MySQL, you might write this query:

select last_name from phone_book where last_name rlike "(son|ith)$"

However, youd find that it runs slowly, and it does
so for the same reasons that wildcard searches are slow.
Theres simply no generalized and efficient way to
build an index that facilitates running arbitrary wildcard or
regular-expression searches.

In this specific case, you can work around this limitation by storing
reversed last names in a second field. Then you can reverse the sense
of the search and use a query like this:

select last_name from phone_book where rev_last_name like "thi%"
union
select last_name from phone_book where rev_last_name like "nos%"

But thats efficient only because
youre starting at the beginning of the string,
which is really the end of the real string before it is reversed.
Again, theres no general solution to this problem.

Note that a regular expression still isnt efficient
in this case. You might be tempted to write this query:

select last_name from phone_book where rev_last_name rlike "^(thi|nos)"

You would be disappointed by its performance. The MySQL optimizer
simply never tries to optimize regex-based queries.

4.3.6.3 Poor statistics or corruption

If MySQLs internal
index statistics become corrupted or otherwise incorrect (possibly as
the result of a crash or accidental server shutdown), MySQL may begin
to exhibit very strange behavior. If the statistics are simply wrong,
you may find that it no longer uses an index for your query. Or it
may use an index only some of the time.

Whats likely happened is that MySQL believes that
the number of rows that match your query is so high that it would
actually be more efficient to perform a full table scan. Because
table scans are primarily sequential reads, theyre
faster than reading a large percentage of the records using an index,
which requires far more disk seeks.

If this happens (or you suspect it has), try the index repair and
analysis commands explained in the "Index
Maintenance" section later in this chapter.

4.3.6.4 Too many matching rows

Similarly, if a table
actually does have too many rows that really do match your query,
performance can be quite slow. How many rows are too many for MySQL?
It depends. But a good rule of thumb is that when MySQL believes more
than about 30% of the rows are likely matches, it will resort to a
table scan rather than using the index. There are a few exceptions to
this rule. Youll find a more detailed discussion of
this problem in Chapter 5.

High Performance MySQL [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی