Chapter 14. Database Access


Contents:

Introduction

Making and Using a DBM File

Emptying a DBM File

Converting Between DBM Files

Merging DBM Files

Sorting Large DBM Files

Storing Complex Data in a DBM File

Persistent Data

Saving Query Results to Excel or CSV

Executing an SQL Command Using DBI

Escaping Quotes

Dealing with Database Errors

Repeating Queries Efficiently

Building Queries Programmatically

Finding the Number of Rows Returned by a Query

Using Transactions

Viewing Data One Page at a Time

Querying a CSV File with SQL

Using SQL Without a Database Server

Program: ggh—Grep Netscape Global History

I only ask for information.

Charles Dickens, David Copperfield


14.0. Introduction


Everywhere you find data, you find
databases. At the simplest level, every file can be considered a
database. At the most complex level, expensive and complex relational
database systems handle thousands of transactions per second. In
between are countless improvised schemes for fast access to loosely
structured data. Perl can work with all of
them.

Early in the history of computers, people noticed that flat file
databases don't scale to large data sets. Flat files were tamed using
fixed-length records or auxiliary indices, but updating became
expensive, and previously simple applications bogged down with I/O
overhead.

After some head-scratching, clever programmers devised a better
solution. As hashes in memory provide more flexible access to data
than do arrays, hashes on disk offer more convenient kinds of access
than array-like text files. These benefits in access time cost you
space, but disk space is cheap these days (or so the reasoning goes).


The DBM library gives Perl programmers a simple, easy-to-use database.
You use the same standard operations on hashes bound to DBM files as
you do on hashes in memory. In fact, that's how you use DBM databases
from Perl. You use tie to associate a hash with a
class and a file. Then whenever you access the hash, the class
consults or changes the DBM database on disk. The old
dbmopen function also did this, but only let you
use one DBM implementation in your program, so you couldn't copy from
one format to another.
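
As a minimal sketch of the tie mechanism described above, the following binds a hash to an SDBM file, since SDBM is the one implementation whose source is bundled with Perl. The file name `fruit`, the scratch directory, and the keys are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;                   # bundled with Perl
use File::Temp qw(tempdir);

# Hypothetical location: a scratch directory that is removed at exit.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/fruit";         # SDBM creates fruit.pag and fruit.dir

# Bind %color to the on-disk database.
tie(my %color, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666)
    or die "Cannot tie $file: $!";

$color{apple}  = 'red';          # stored in the database
$color{banana} = 'yellow';
delete $color{banana};           # removed from the database

untie %color;                    # break the binding and close the files

# Reopen through a second hash to show that the data outlived the first.
tie(my %again, 'SDBM_File', $file, O_RDONLY, 0666)
    or die "Cannot reopen $file: $!";
my $apple = $again{apple};
my @keys  = sort keys %again;
untie %again;

print "apple is $apple; keys: @keys\n";
```

Ordinary hash operations (assignment, `delete`, `keys`) are all that is needed once the tie is in place; switching to another DBM implementation should mostly be a matter of naming a different class in the `tie` call.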

Recipe 14.1 shows how to create a DBM
database and gives tips on using it efficiently. Although you can do
with DBM files the same things you do with regular hashes, their
disk-based nature leads to performance concerns that don't exist with
in-memory hashes. Because DBM files are disk-based and can be shared
between processes, use a sentinel lock file (see Recipe 7.24) to
regulate concurrent access to them. Recipe 14.2 and Recipe 14.4 explain
these concerns and show how to work around them. DBM files also make
possible operations that aren't available using regular hashes.
Recipe 14.5 explains two of these operations.
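
The sentinel lock file mentioned above can be sketched roughly as follows. This is an illustration of the general idea, not the code from Recipe 7.24; the helper names `with_db_lock` and `bump`, the `.lock` suffix, and the `hits` key are all hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_* and LOCK_* constants
use SDBM_File;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);   # hypothetical scratch location
my $dbfile   = "$dir/counter";
my $lockfile = "$dbfile.lock";          # sentinel file; the name is an assumption

# Run a piece of code while holding an exclusive lock on the sentinel.
sub with_db_lock {
    my ($code) = @_;
    open(my $lock, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($lock, LOCK_EX)           or die "Cannot lock $lockfile: $!";
    my $result = $code->();
    close($lock);                       # closing the handle releases the lock
    return $result;
}

# Increment a counter stored in the DBM file, one process at a time.
sub bump {
    return with_db_lock(sub {
        tie(my %db, 'SDBM_File', $dbfile, O_RDWR|O_CREAT, 0666)
            or die "Cannot tie $dbfile: $!";
        my $n = ++$db{hits};
        untie %db;                      # close the database before unlocking
        return $n;
    });
}

bump() for 1 .. 2;
my $hits = bump();
print "hits: $hits\n";
```

The lock is taken on a separate file rather than on the database files themselves, so the database is opened and closed entirely inside the locked region.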

Various DBM implementations offer varying features. Table 14-1 shows several possible DBM libraries you can choose from.

Table 14-1. DBM libraries and their features

Feature                     NDBM       SDBM       GDBM       DB
--------------------------  ---------  ---------  ---------  ---------
Linkage comes with Perl     yes        yes        yes        yes
Source bundled with Perl    no         yes        no         no
Source redistributable      no         yes        gpl[25]    yes
FTPable                     no         yes        yes        yes
Easy to build               N/A        yes        yes        ok[26]
Often comes with Unix       yes[27]    no         no[28]     no[28]
Builds okay on Unix         N/A        yes        yes        yes[29]
Builds okay on Windows      N/A        yes        yes        yes[30]
Code size                   [31]       small      big        big[32]
Disk usage                  [31]       small      big        ok
Speed                       [31]       slow       ok         fast
Block size limits           4k         1k[33]     none       none
Byte-order independent      no         no         no         yes
User-defined sort order     no         no         no         yes
Partial key lookups         no         no         no         yes

[25]Using GPLed code in your program places
restrictions upon you. See
http://www.gnu.org for more details.

[26]See the DB_File library method. Requires symbolic
links.

[27]On mixed-universe machines, this may be in the
BSD compatibility library, which is often shunned.

[28]Except for free Unix ports such as Linux, FreeBSD, [...]

[29][...]

[30][...] versions of Perl on Windows systems were widely available,
including the standard port build from the normal Perl distribution and
several proprietary ports. Like most CPAN modules, DB builds only on the
standard port.

[31]Depends on how much your vendor has tweaked
it.

[32]Can be reduced if you compile for one access [...]

[33][...] of compatibility with older files).




NDBM comes with most BSD-derived machines. GDBM is a GNU DBM
implementation. SDBM is part of the X11 distribution and also the
standard Perl source distribution. DB refers to the Berkeley DB
library. While the others are essentially reimplementations of the
original DBM library, the Berkeley DB code gives you three different
types of database on disk and attempts to solve many of the disk,
speed, and size limitations that hinder the other implementations.


Code size refers to the size of the compiled libraries. Disk usage refers
to the size of the database files it creates. Block size limits refer
to the database's maximum key or value size. Byte-order independence
refers to whether the database system relies on hardware byte order
or whether it instead creates portable files. A user-defined sort
order lets you tell the library in what order to return lists of
keys. Partial key lookups let you make approximate searches on the
database.
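
The last two features can be sketched with the DB_File module's BTREE format. This is a minimal illustration that assumes DB_File and the Berkeley DB library are installed; the sample keys and the case-insensitive comparison are arbitrary choices, not taken from the text.

```perl
use strict;
use warnings;
use Fcntl;
use DB_File;    # needs the Berkeley DB library to be installed

# A separate BTREEINFO object, so the global $DB_BTREE is left untouched.
my $btree = DB_File::BTREEINFO->new;
$btree->{compare} = sub {
    my ($k1, $k2) = @_;
    return lc($k1) cmp lc($k2);      # user-defined, case-insensitive order
};

# An undef filename asks for an in-memory database.
my $db = tie(my %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $btree)
    or die "Cannot create BTREE: $!";

$h{$_} = length $_ for qw(Wall smith mouse Apple);

my @ordered = keys %h;               # returned in the comparator's order

# Partial key lookup: R_CURSOR moves to the first key >= the one given
# and overwrites $key and $value with the pair it found.
my ($key, $value) = ('sm', undef);
my $status = $db->seq($key, $value, R_CURSOR);   # 0 means a key was found

print "order: @ordered\n";
print "first match for 'sm': $key => $value\n" if $status == 0;

undef $db;                           # drop the object before untying
untie %h;
```

The comparison function has to be the same every time a given database is opened, because the on-disk tree is laid out according to it.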

Most Perl programmers prefer the Berkeley DB implementation. Many
systems already have this library installed, and Perl can use it. If
yours does not, fetch and install it from CPAN. It will
make your life much easier.

DBM files provide key/value pairs. In relational database terms, you
get a database with one table that has only two columns.
Recipe 14.6 shows you how to use the MLDBM module from
CPAN to store arbitrarily complex data structures in a DBM
file.
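
The general technique behind storing complex data in a DBM file can be sketched without MLDBM itself: flatten the structure into a string before storing it, and rebuild it after fetching. The example below uses the core Storable module and SDBM_File to do this by hand. It is not MLDBM's interface, and the record layout is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use Storable qw(freeze thaw);    # core serializer
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1); # hypothetical scratch location
tie(my %db, 'SDBM_File', "$dir/people", O_RDWR|O_CREAT, 0666)
    or die "Cannot tie: $!";

# A DBM value must be a flat string, so flatten the structure first.
my %tom = (name => 'Tom', langs => ['perl', 'c']);
$db{tom} = freeze(\%tom);

# Reading reverses the step: fetch the string, then rebuild the structure.
my $copy = thaw($db{tom});
push @{ $copy->{langs} }, 'awk';
$db{tom} = freeze($copy);        # changes must be stored back explicitly

my @langs = @{ thaw($db{tom})->{langs} };
untie %db;

print "langs: @langs\n";
```

Because the thawed structure is only a copy, modifying it does nothing to the database until the whole value is frozen and assigned again. With SDBM, the serialized string must also fit within the block size limit listed in Table 14-1.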

As good as MLDBM is, it doesn't get around the limitation that you
only retrieve rows based on one single column, the hash key. If you
need complex queries, the difficulties can be overwhelming. In these
cases, consider a separate database management system (DBMS). The DBI
project provides modules to work with Oracle, Sybase, mSQL, MySQL,
Ingres, and others.

An interesting medium between a full relational database server and a
DBM file is the DBD::SQLite module. This provides an SQL interface to
a relational database, but without a server process: the module
reads and writes the single file that contains all your tables. This
gives you the power of SQL and multiple tables without the
inconvenience of RDBMS administration. A benefit of manipulating
tables from the one process is a considerable gain in speed.
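
A short sketch of what this looks like in practice, assuming DBI and DBD::SQLite are installed from CPAN. The table and column names are invented for illustration, and the database is kept in memory here only so the example leaves no file behind.

```perl
use strict;
use warnings;
use DBI;    # DBI and DBD::SQLite both come from CPAN

# ":memory:" keeps the database in RAM; a filename would persist it on disk.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical tables, for illustration only.
$dbh->do('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)');
$dbh->do('CREATE TABLE books (title TEXT, author_id INTEGER)');

$dbh->do('INSERT INTO authors (id, name) VALUES (?, ?)', undef, 1, 'Dickens');
my $ins = $dbh->prepare('INSERT INTO books (title, author_id) VALUES (?, ?)');
$ins->execute($_, 1) for 'David Copperfield', 'Bleak House';

# A join across two tables, which a single DBM file cannot express.
my $rows = $dbh->selectall_arrayref(<<'SQL', undef, 'Dickens');
SELECT b.title
  FROM books b JOIN authors a ON a.id = b.author_id
 WHERE a.name = ?
 ORDER BY b.title
SQL

my @titles = map { $_->[0] } @$rows;
print "$_\n" for @titles;

$dbh->disconnect;
```

No server is started or contacted at any point; the driver does all of the work inside the calling process.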

See http://dbi.perl.org/doc/1l and
http://search.cpan.org/modlist/Database_Interfaces.
DBI supports most major and minor databases, including Oracle, ODBC,
Sybase, Informix, MySQL, PostgreSQL, and XBase. There are also DBD
interfaces to data sources such as SQLite, Excel files, and CSV
files.







Copyright © 2003 O'Reilly & Associates. All rights reserved.
