SQL Bible [Electronic resources] نسخه متنی

Relational Databases

The frustration with the inadequate
capabilities of network and hierarchical databases resulted in the invention of
the relational data model. The relational data model took
the idea of the network database some several steps further. Relational models
— just like hierarchical and network models — are based upon tables and use
parent/child relationships. (Though this relationship was implemented through
column values as opposed to a low-level physical pointer defining the
relationship; more on that later in the chapter.)

Tables

A table is a basic building unit of the
relational database. It is a fairly intuitive way of organizing data and has
been around for centuries. A table consists of rows and columns (called
records and fields in database
jargon). Each table has a unique name in the database
(i.e., unique fully qualified name, the one that includes
schema or database name as a prefix).

Note

The Dot (.) notation in a fully qualified name is commonly used
in the programming world to describe hierarchy of the objects and their
properties. This could refer not only to the database objects but also to the
structures, user-defined types, and such. For example, a table field in an MS
SQL Server database could be referred to as ACME.DBO.CUSTOMER. CUST_ID_N where
ACME is a database name, DBO is the table owner (Microsoft standard), CUSTOMER
is the name of the table, and CUST_ID_N is the column name in the CUSTOMER
table.

Cross-References

See
Chapter
4 for more on table and other database object names.

Each field has a unique name within the
table, and any table must have at least one field. The number of fields per
table is usually limited, the actual limitation being dependent on a particular
implementation. Unlike legacy database structure, records in a table are not
stored or retrieved in any particular order (although, records can be arranged
in a particular order by means of using clustered index —
discussed in
Chapter
4); the task of sorting the record in relational databases systems
(RDBMS) is relegated to SQL.

A record thus is composed of a number of
cells, where each cell has a unique name and might contain some data. A table
that has no records is called an empty table.

Data within the field must be of the same
type, for example, the field
AMOUNT contains only numbers, and field
DESCRIPTION, only words. The set of the
data within one field is said to be column's domain.

Note

Early databases — relational or otherwise — were designed to
contain only text data; modern databases store anything that could be converted
into binary format: pictures, movies, audio records, and so on.

The good relational design would make
sure that such a record describes an entity — another
relational database term to be discussed later in the book but worth mentioning
here. To put it in other words, the record should not contain irrelevant
information:
CUSTOMER table deals with the customer
information only, its records should not contain information about, say,
products that this customer ordered.

Note

The process of grouping the relevant data together, eliminating
redundancies along the way is called normalization and
will be discussed in
Chapter
2. It is not part of SQL per se, but it does impose limits on the SQL
query efficiency.

There is no theoretical limit on the
number of rows a table could have, though some implementations impose
restrictions; also there are (or at least ought to be) practical considerations
to the limits: data retrieval speed, amount of storage, and so on.

Relationships

Tables in
RDBMS might or might not be related. As it was mentioned before, RDBMS is built
upon parent/child relationship notion (hence the name —
relational), but unlike"in legacy databases (hierarchical,
network) these relations are based solely on the values in the table columns;
these relationships are meaningful in logical terms, not in low-level computer
specific pointers. Let's take the example of our fictitious order entry
database (the one that we will design, build, and use throughout the book). The
ORDER_HEADER table is related to
CUSTOMER table since both of these tables
have a common set of values: The field
ORDHDR_CUSTID_FN (customer ID) in
ORDER_HEADER (and its values) corresponds
to
CUST_ID_N in
CUSTOMER. The field
CUST_ID_N is said to be a
primary key for the
CUSTOMER table and a foreign
key for the
ORDER_HEADER table (under different
name).

Primary key

The primary key
holds more than one job in RDBMS. We've said already that it is used
to define a relationship; but its primary role is to uniquely identify each
record in a table.

In the days of legacy databases, the
records were always stored in some predefined order; if such an order had to be
broken (because somebody had inserted records in a wrong order or business rule
was changed), then the whole table (and, most likely, the whole database) had
to be rebuilt. The RDBMS abolishes fixed order for the records, but it still
needs some mechanism of identifying the records uniquely, and the primary key,
based on the idea of a field (or fields) that contains set unique values,
serves exactly this purpose.

By it is very nature, the primary key
cannot be empty; this means that in a table with defined primary key, the
primary key fields must contain data for each record.

Note

Though it is not a requirement to have a primary key on each and
every table, it is considered to be a good practice to have one; in fact, many
RDBMS implementations would warn you if you create a table without defining a
primary key. Some purists go even further, specifying that the primary key
should be meaningless in the sense that they would use
some generated unique value (like
EMPLOYEE_ID) instead of, say, Social
Security numbers (despite that these are unique as well).

A primary key could consist of one or
more columns, i.e., though some fields may contain duplicate values, their
combination (set) is unique through the entire table. A key that consists of
several columns is called a composite key.

Note

In the world of RDBMS, only tables that have primary keys can be
related. Though the primary key is a cornerstone for defining relation in
RDBMS, the actual implementations (especially early ones) have not always
provided a built-in support for this logical concept. In practice, the task of
enforcing uniqueness of a chosen primary key was the responsibility of
programmers (requiring them to check for existing values before inserting new
records, for example). Today all major relational database products have
built-in support for primary keys; on a very basic level this means that the
database does its own checking for unique constraint violations and will raise
an error whenever an attempt to insert a duplicate record is made.

Foreign key

Let's go back to our
CUSTOMER and
ORDER_HEADER tables. By now you
understand why the
CUST_ID_N was designated as a primary key
— it has unique value, no customer can possibly have more than one ID, and no
ID could be assigned to more than one customer. To track what customers placed
which orders, you need something that will provide a link between customers and
their orders.

Table
ORDER_HEADER has its own primary key —
ORDHDR_ID_N which uniquely identifies
orders; in addition to that it will have a foreign key
ORDHDR_CUSTID_FN field. The values in
that field correspond to the values in the
CUST_ID_N primary key field for the
CUSTOMER table. Note that, unlike the
primary key, the foreign key is not required to be unique — one customer could
place several orders.

Now, by looking into
ORDER_HEADER table you can find which
customers placed particular orders. The table
ORDER_HEADER became related to table
CUSTOMER. It became easy to find a
customer based on orders, or find orders for a customer. You no longer need to
know database layout, order of the records in the table, or master some
low-level proprietary programming language to query data; it's now possible to
run ad-hoc queries formulated in standard English-like language — the
Structured Query Language.

Invasion of RDBMS

In spite of the clear advantages of the
relational database model, it took some time for it to become workable. One of
the main reasons was the hardware. The logically clear and clean model proved
to be quite a task to implement, and even then it required much more in terms
of memory and processing power than legacy databases.

The development of relational databases
was driven by the need of the medium to big businesses to gather, preserve, and
analyze data. In 1965, Gordon Moore, the cofounder of Intel, made his famous
observation that the number of transistors per square inch on the integrated
circuits (IC) doubles every year ever since the IC were invented. Surprisingly,
this rule still holds true. More powerful machines made it feasible to
implement and sell RDBMS; cheap memory and powerful processors made them fast;
perpetually growing appetites for information made RDBMS products a commodity,
drastically cutting their price down. Today, according to some estimates, less
than 10 percent of the market is being held by the database legacy "dinosaurs"
— mostly because of significant investment made by their owners more than 20
years ago. For better or for worse, relational database systems have come to
rule on planet Earth.

SQL Bible [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی