Database Legacy
Flat file, hierarchy, and network databasesare usually referred as legacy databases. They represent
the ways people used to organize information in prehistoric times — about 30
years ago.
Flat file databases
The flat file database was probably one of
the earliest database management systems. The idea behind flat file is a very
simple one: one single, mostly unstructured data file. It mirrors "ancient"
precomputer data storage systems: notebooks, grocery lists, and so on. You
could compare it to a desk drawer that holds virtually everything — bill stubs,
letters, small change. While requiring very little effort to put information
in, such a "design" becomes a nightmare to get the
information out, as you would have to scroll through each
and every record searching for the right one. Putting relevant data into
separate files and even organizing them into tables (think of a file cabinet)
alleviates the problem somewhat but does not remove the major obstacles: data
redundancy (the same information might be stored more than once in different
files), slow processing speed ("I know it was there somewhere..."), error-prone
storage and retrieval. Moreover, it required intimate knowledge of the database
structure to work at all — it would be utterly useless to search for, say,
orders information in the expenses file. Let's design a flat database system for an
order entry system that gathers information about customers, orders they've
placed and products the customers had ordered. If data is accumulated
sequentially, your file will contain information about customers, then orders
and products, then about some new customer, and so on — all in the order the
data is entered (Table
1-2). Just imagine a task of extracting any meaningful information from
this mess, not to mention that a lot of the cells will remain empty. (What
would you fill Quantity column for the "Ace Hardware" or Address column for
"Nails" with?)
Name | Type | Address | Price | Quantity |
---|---|---|---|---|
Nails | Product | n/a | 100 | 2000 |
Ace Hardware | Customer | 1234 Willow Ct Seattle, Washington | n/a | n/a |
Cedar planks | Product | n/a | 2000 | 5000 |
Dissatisfaction with these shortcomings
stimulated development in the area of data storage-and-retrieval
systems.
Note | Excel is often used to create flat file databases. |
Hierarchical databases
The concept of a hierarchical database was
around since the 1960s and — believe it or not — it is still in use. The
hierarchical model is fairly intuitive: As the name implies, it stores data in
hierarchical structure, similar to that of a family tree, organization chart,
or pyramid; some readers could visualize a computer file system as it is
presented through some graphical interface.The most popular hierarchical database
product is IBM's Information Management System (IMS) that runs on mainframe
computers. First introduced in 1968, it is still around (after a number of
reincarnations), primarily because hierarchical databases provide impressive
raw speed performance for certain types of queries.It is based on "parent/child" paradigm in
which each parent could have many children but each child has one and only one
parent. You can visualize this structure as an upside down tree, starting at
the root (trunk) and branching out at many levels (Figure 1-1).

Figure 1-1: Hierarchical structure
Since the records in a child table are
accessed through a hierarchy of levels there could not be a record in it
without a corresponding pointer record in the parent table — all the way up to
the root. You could compare it to a file management system (like a tree-view
seen in the Microsoft Windows Explorer) — to get access to a file within a
directory one must first open the folder that contains this file.Let's improve upon the previously
discussed flat file model. Instead of dumping all the information into a single
file you are going to split it among three tables, each containing pertinent
information: business name and address for the
CUSTOMER table; product description,
brand name, and price for the
PRODUCT table; and an
ORDER_HEADER table to store the details
of the order.In the hierarchical database model
redundancy is greatly reduced (compared with flat file database model): You
store information about customer, product, and so on once only. The table
ORDER_HEADER (Figure 1-2) would
contain pointers to the customer and to the product this customer had ordered;
whenever you need to see what products any particular customer purchased, you
start with
ORDER_HEADER table, find list of id(s)
for all the customers who placed orders and list of product id(s) for each
customer; then, using
CUSTOMER table you find the customer name
you are after, and using products id(s) list you get the description of the
products from the
PRODUCT table.

Figure 1-2: Hierarchical database
example
Everything works great as long as one is
willing to put up with a somewhat nonintuitive way of retrieving information.
(No matter what information is requested one always has to start with the root,
i.e.,
ORDER_HEADER table.) Should you need only
customers' names the hierarchical database would be blazingly fast — going
straight from a parent table to the child one. To get any information from the
hierarchical database a user has to have an intimate knowledge of the database
structure; and the structure itself was extremely inflexible — if, for
instance, you'd decided that the customers must place an order through a third
party, you'd need to rewire all relationships because
CUSTOMER table would not be related to
ORDER_HEADER table anymore, and all your
queries will have to be rewritten to include one more step — finding the sales
agent who sold this product, then finding customers who bought it. It also
makes obvious the fact that you did not escape the redundancy problem — if you
have a customer who places an order through more than one sales agent, you'll
have to replicate all the information for each agent in a number of customer
tables.
But what happens if you need to add a
customer that does not have a placed order, or a product that no one yet
ordered? You cannot — your hierarchical database is incapable of storing
information in child tables without a parent table having a pointer to it: by
the very definition of hierarchy there should be neither a product without an
order, nor a customer without an order — which obviously cannot be the case in
the real world. The hierarchical databases handle
one-to-many relationship (see
Chapter
2 for definition) very well. However, in many cases you will want to
have the child be related to more than one parent: Not only one product could
be present in many orders, but one order could contain many products. There is
no answer (at least not an easy one) within the domain of hierarchical
databases.
Network databases
Attempts to solve the problems associated
with hierarchical databases produced the network database
model. This model has its origins in the Conference on Data Systems Languages
(CODASYL), an organization founded in 1957 by the U.S. Department of Defense.
CODASYL was responsible for developing COBOL — one of the first widely popular
programming languages — and publishing the Network Database standard in 1971
The most popular commercial implementation of the network model was Adabas
(long since converted to the relational model).The network model is very similar to the
hierarchical one; it is also based on the concept of parent/child relationship
but removes the restriction of one child having one and only one parent. In the
network database model a parent can have multiple children, and a child can
have multiple parents. This structure could be visualized as several trees that
share some branches. In network database jargon these relationships came to be
known as sets. In addition to the ability to handle a
one-to-many relationship, the network database can handle many-to-many
relationships.
Cross-References | One-to-one, one-to-many, and many-to-many relationships are explained in Chapter 2. |
Also, data access did not have to begin
with the root; instead one could traverse the database structure starting from
any table and navigating a related table in any direction (Figure 1-3).

Figure 1-3: Network database
example
In this example, to find out what products
were sold to what customers we still would have to start with
ORDER_HEADER and then proceed to
CUSTOMER and
PRODUCT — nothing new here. But things
greatly improve for the scenario when customers place an order through more
than one agent: no longer does one have to go through agents to list customers
of the specific product, and no longer has one to start at the root in search
of records.While providing several advantages,
network databases share several problems with hierarchical databases. Both are
very inflexible, and changes in the structure (for example, a new table to
reflect changed business logic) require that the entire database be rebuilt;
also, set relationships and record structures must be predefined.The major disadvantage of both network and
hierarchical database was that they are programmers' domains. To answer the
simplest query, one had to create a program that navigated database structure
and produced an output; unlike SQL this program was written in procedural,
often proprietary, language and required a great deal of knowledge — of both
database structure and underlying operating system. As a result, such programs
were not portable and took enormous (by today's standards) amount of time to
write.