SQL Performance Tuning [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

SQL Performance Tuning [Electronic resources] - نسخه متنی

Peter Gulutzan, Trudy Pelzer

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید




Chapter 17. Cost-Based Optimizers


Supposedly, a rule-based optimizer (RBO) differs from a cost-based optimizer (CBO). Consider this SQL statement:


SELECT * FROM Table1
WHERE column2 = 55

Assume that column2 is an indexed, non-unique column. A rule-based optimizer will find column2 in the system catalog and discover that it is indexed, but not uniquely indexed. The RBO then combines this data with the information that the query uses the equals operator. A common assumption in the field of optimization is that "= <literal>" search conditions will retrieve 5% of all rows. (In contrast, the assumption for greater-than conditions is that they will retrieve 25% of all rows.) That is a narrow search, and usually it's faster to perform a narrow search with a B-tree rather than scan all rows in the table. Therefore the rule-based optimizer makes a plan: find matching rows using the index on column2.

Notice that the rule-based optimizer is using a non-volatile datum (the existence of an index) and a fixed assumption (that equals searches are narrow).

A cost-based optimizer can go further. Suppose the system catalog contains three additional pieces of information: (1) that there are 100 rows in Table1, (2) that there are two pages in Table1, and (3) that the value 55 appears 60 times in the index for column2. Those facts change everything. The equals operation will match on 60% of the rows, so it's not a narrow search. And the whole table can be scanned using two page reads, whereas an index lookup would take three page reads (one to lookup in the index, two more to fetch the data later). Therefore the cost-based optimizer makes a different plan: find matching rows using a table scan.

Notice that the cost-based optimizer is using volatile data (the row and column values that have been inserted) and an override (that the contents are more important than the fixed assumptions).

In other words, a cost-based optimizer is a rule-based optimizer that has additional, volatile information available to it so that it can override the fixed assumptions that would otherwise govern its decisions. The terminology causes an impression that one optimizer type is based on rules while the other is based on cost. That's unfortunate because both optimizer types use rules and both optimizer types have the goal of calculating cost. The reality is that cost-based is an extension of rule-based, and a better term would have been something like "rule-based++."

Most vendors claim that their DBMSs have cost-based optimizers, as you can see from Table 17-1. The claims don't mean much by themselves. What's important is whether the optimizer estimates cost correctly and how it acts on the estimate. In this chapter, we'll look at the actions DBMSs take to fulfill their claims.




























































Table 17-1. Cost-Based Optimizers
Claims to be CBO
"Explains" the Access Plan
"Updates" Statistics for the Optimizer
IBM
Yes
EXPLAIN
RUNSTATS
Informix
Yes
SET EXPLAIN
UPDATE STATISTICS
Ingres
Yes
EXECUTE QEP
optimizedb utility
InterBase
Yes
SELECT … PLAN
SET STATISTICS
Microsoft
Yes
EXPLAIN
UPDATE STATISTICS
MySQL
No
EXPLAIN
ANALYZE TABLE
Oracle
Yes
EXPLAIN PLAN FOR
ANALYZE
Sybase
Yes
SET SHOWPLAN ON
UPDATE STATISTICS

Notes on Table 17-1:


Claims to be CBO column

This column is "Yes" if the DBMS's documentation makes the claim that it operates with a cost-based optimizer.


"Explains" the Access Plan column

This column shows the non-standard statement provided by the DBMS so that you can examine the access plan the optimizer will use to resolve an SQL statement. For example, if you want to know how Oracle will resolve a specific SELECT statement, just execute an EXPLAIN PLAN FOR statement for the SELECT.


"Updates" Statistics for the Optimizer column

This column shows the non-standard statement or utility the DBMS provides so that you can update volatile information, or statistics, for the optimizer. For example, if your DBMS is MySQL and you've just added many rows to a table and want the optimizer to know about them, just execute an ANALYZE TABLE statement for the table.



/ 124