Perl Cd Bookshelf [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Perl Cd Bookshelf [Electronic resources] - نسخه متنی

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید

14.5. Sorting Large DBM Files


14.5.1. Problem



You want to
process a large dataset you'd like to commit to a DBM file in a
particular order.

14.5.2. Solution


Use the DB_File's B-tree bindings
and supply a comparison function of your own devising:

use DB_File;
# specify the Perl sub to do key comparison using the
# exported $DB_BTREE hash reference
$DB_BTREE->{'compare'} = sub {
my ($key1, $key2) = @_ ;
"\L$key1" cmp "\L$key2" ;
};
tie(%hash, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE)
or die "can't tie $filename: $!";

14.5.3. Description


An annoyance of hashes, whether in memory or as DBM files, is that
they do not maintain proper ordering. The CPAN module Tie::IxHash can
make a regular hash in memory maintain its insertion order, but that
doesn't help for DBM databases or arbitrary sorting criteria.

The DB_File module supports a nice
solution to this using a B-tree implementation. One advantage of a
B-tree over a regular DBM hash is its ordering. When the user defines
a comparison function, all calls to keys,
values, and each are
automatically ordered. For example, Example 14-3 is a
program that maintains a hash whose keys will always be sorted
case-insensitively.

Example 14-3. sortdemo


  #!/usr/bin/perl
#
sortdemo - show auto dbm sorting
use strict;
use DB_File;
$DB_BTREE->{'compare'} = sub {
my ($key1, $key2) = @_ ;
"\L$key1" cmp "\L$key2" ;
};
my %hash;
my $filename = '/tmp/sorthash.db';
tie(%hash, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE)
or die "can't tie $filename: $!";
my $i = 0;
for my $word (qw(Can't you go camp down by Gibraltar)) {
$hash{$word} = ++$i;
}
while (my($word, $number) = each %hash) {
printf "%-12s %d\n", $word, $number;
}

By default, the entries in a B-tree DB_File database are stored
alphabetically. Here, though, we provide a case-insensitive
comparison function, so using each to fetch all
the keys would show:

by           6
camp 4
Can't 1
down 5
Gibraltar 7
go 3
you 2

This sorting property on hashes is so convenient that it's worth
using even without a permanent database. If you pass
undef where the filename is expected on the
tie, DB_File will create a file in
/tmp and then immediately unlink it, giving an
anonymous database:

tie(%hash, "DB_File", undef, O_RDWR|O_CREAT, 0666, $DB_BTREE)
or die "can't tie: $!";

Remember these two things if you supply a comparison for your BTREE
database. One, the new compare function must be specified when you
create the database. Two, you cannot change the ordering once the
database has been created; you must use the same compare function
every time you access the database.

Using BTREE databases under DB_File also permits duplicate or partial
keys. See its documentation for examples.

14.5.4. See Also


Recipe 5.7


/ 875