Perl CD Bookshelf [Electronic resources] (text version)


10.18. Program: Sorting Your Mail




Example 10-1 sorts a mailbox by subject. It reads input a
paragraph at a time, looking for a paragraph with "From"
at the start of a line. When it finds one, it searches for the
subject, strips any "Re:" marks from it, and stores the
lowercased result in the @sub array. Meanwhile, the messages
themselves are stored in a corresponding @msgs array. The
$msgno variable keeps track of the message number.
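
The two building blocks described above, paragraph-mode reading and subject normalization, can be tried in isolation before reading the full program. The following is only a minimal sketch: the helper names normalize_subject and read_paragraphs and the sample mailbox text are assumptions made for illustration, not part of the original program.

```perl
#!/usr/bin/perl
# Sketch: paragraph-mode reading plus subject normalization.
use strict;
use warnings;

# Hypothetical helper: pull the subject out of a header paragraph,
# drop any leading "Re:" marks, and lowercase what remains.
# Returns '' when the paragraph has no Subject: line.
sub normalize_subject {
    my ($paragraph) = @_;
    return $paragraph =~ /^Subject:\s*(?:Re:\s*)*(.*)/mi ? lc $1 : '';
}

# Hypothetical helper: split a string into paragraphs the way
# $/ = '' does, using an in-memory filehandle.
sub read_paragraphs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "can't open in-memory handle: $!";
    local $/ = '';              # paragraph mode: blank lines end a record
    my @paragraphs = <$fh>;
    close $fh;
    return @paragraphs;
}

# Made-up sample mailbox with one message.
my $mbox = "From alice Mon Jan  1 00:00:00 2001\n"
         . "Subject: Re: RE: Lunch Plans\n"
         . "\n"
         . "Body text here.\n";

for my $para (read_paragraphs($mbox)) {
    next unless $para =~ /^From/m;          # only header paragraphs
    print normalize_subject($para), "\n";   # prints "lunch plans"
}
```

Because the pattern uses the /i flag, "Re:" and "RE:" are both stripped, and the (?:Re:\s*)* group removes any number of them.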

Example 10-1. bysub1


#!/usr/bin/perl
# bysub1 - simple sort by subject
my(@msgs, @sub);
my $msgno = -1;
$/ = '';                    # paragraph reads
while (<>) {
    if (/^From/m) {
        /^Subject:\s*(?:Re:\s*)*(.*)/mi;
        $sub[++$msgno] = lc($1) || '';
    }
    $msgs[$msgno] .= $_;
}
for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) {
    print $msgs[$i];
}

That sort is only sorting array indices. If the
subjects are the same, cmp returns 0, so the
second part of the || is taken, which compares the
message numbers in the order they originally appeared.

If sort were fed a list like
(0,1,2,3), that list would get sorted into a
different permutation, perhaps (2,1,3,0). We
iterate across them with a for loop to print out
each message.
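
To make the index sort concrete, here is a small sketch with made-up data. The helper name sorted_indices and the sample arrays are assumptions for illustration; the comparison itself is the one used in Example 10-1.

```perl
#!/usr/bin/perl
# Sketch: sort array indices by subject, ties broken by position.
use strict;
use warnings;

# Hypothetical helper: return the indices of @$subjects ordered by
# subject, falling back to the original position when subjects match.
sub sorted_indices {
    my ($subjects) = @_;
    return sort { $subjects->[$a] cmp $subjects->[$b] || $a <=> $b }
           0 .. $#$subjects;
}

# Parallel arrays: $sub[$i] is the subject of $msgs[$i].
my @sub  = ('zebra', 'apple', 'zebra', 'apple');
my @msgs = ("msg 0\n", "msg 1\n", "msg 2\n", "msg 3\n");

my @order = sorted_indices(\@sub);
print "@order\n";           # prints "1 3 0 2"
print @msgs[@order];        # the messages, in sorted order
```

The two "apple" messages keep their original relative order (1 before 3) because the second comparison only comes into play when cmp returns 0.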

Example 10-2 shows how an awk
programmer might code this program, using the -00 switch to read paragraphs instead of
lines.

Example 10-2. bysub2


#!/usr/bin/perl -n00
# bysub2 - awkish sort-by-subject
INIT { $msgno = -1 }
$sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m;
$msg[$msgno] .= $_;
END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] }

Perl programmers have used parallel arrays
like this since Perl 1. Keeping each message in a hash is a more
elegant solution, though. We'll store each message as an anonymous
hash, as described in Chapter 11, and sort on the fields of that
hash.

Example 10-3 is a program similar in spirit to Example 10-1 and Example 10-2.

Example 10-3. bysub3


#!/usr/bin/perl -00
# bysub3 - sort by subject using hash records
use strict;
my @msgs = ( );
while (<>) {
    # use '' when there is no Subject: line; a failed match in list
    # context would return an empty list and misalign the hash pairs
    push @msgs, {
        SUBJECT => (/^Subject:\s*(?:Re:\s*)*(.*)/mi ? $1 : ''),
        NUMBER  => scalar @msgs,  # which msgno this is
        TEXT    => '',
    } if /^From/m;
    $msgs[-1]{TEXT} .= $_;
}
for my $msg (sort {
                    $a->{SUBJECT} cmp $b->{SUBJECT}
                            ||
                    $a->{NUMBER}  <=> $b->{NUMBER}
                  } @msgs
            )
{
    print $msg->{TEXT};
}

Once you have real hashes, adding
further sorting criteria is simple. A common way to sort a folder is
subject-major, date-minor order. The hard part is figuring out how to
parse and compare dates. Date::Manip does this, returning a string
you can compare; however, the datesort program in Example 10-4,
which uses Date::Manip, runs more than 10 times slower than the
previous one. Parsing dates in unpredictable formats is extremely
slow.
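
The multi-key comparison can be tried without any date parsing. This sketch assumes the dates have already been normalized to strings that compare chronologically under cmp. The ISO-style values, the record contents, and the helper name by_subject_then_date are all made up for illustration and merely stand in for whatever a date parser would return.

```perl
#!/usr/bin/perl
# Sketch: subject-major, date-minor sort over hash records.
use strict;
use warnings;

# Hypothetical helper: order records by subject, then date, then
# original position. DATE values are assumed to be pre-normalized
# strings that sort chronologically as plain text.
sub by_subject_then_date {
    my @records = @_;
    return sort {
        $a->{SUBJECT} cmp $b->{SUBJECT}
                ||
        $a->{DATE}    cmp $b->{DATE}
                ||
        $a->{NUMBER}  <=> $b->{NUMBER}
    } @records;
}

my @msgs = (
    { SUBJECT => 'lunch',  DATE => '2003-02-01T09:00:00', NUMBER => 0 },
    { SUBJECT => 'budget', DATE => '2003-01-15T12:00:00', NUMBER => 1 },
    { SUBJECT => 'lunch',  DATE => '2003-01-20T08:30:00', NUMBER => 2 },
    { SUBJECT => 'budget', DATE => '2003-01-15T12:00:00', NUMBER => 3 },
);

# prints "1 3 2 0"
print join(' ', map { $_->{NUMBER} } by_subject_then_date(@msgs)), "\n";
```

Each || only falls through to the next criterion when the previous comparison returned 0, so adding a further key is a matter of appending one more comparison.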

Example 10-4. datesort


#!/usr/bin/perl -00
# datesort - sort mbox by subject then date
use strict;
use Date::Manip;
my @msgs = ( );
while (<>) {
    next unless /^From/m;
    my $date = '';
    if (/^Date:\s*(.*)/m) {
        ($date = $1) =~ s/\s+\(.*//;  # library hates (MST)
        $date = ParseDate($date);
    }
    # use '' when there is no Subject: line; a failed match in list
    # context would return an empty list and misalign the hash pairs
    push @msgs, {
        SUBJECT => (/^Subject:\s*(?:Re:\s*)*(.*)/mi ? $1 : ''),
        DATE    => $date,
        NUMBER  => scalar @msgs,
        TEXT    => '',
    };
} continue {
    $msgs[-1]{TEXT} .= $_;
}
for my $msg (sort {
                    $a->{SUBJECT} cmp $b->{SUBJECT}
                            ||
                    $a->{DATE}    cmp $b->{DATE}
                            ||
                    $a->{NUMBER}  <=> $b->{NUMBER}
                  } @msgs
            )
{
    print $msg->{TEXT};
}

Example 10-4 is written to draw attention to the
continue block. When a loop's end is reached,
either because it fell through to that point or got there from a
next, the whole continue block
is executed. It corresponds to the third portion of a three-part
for loop, except that the
continue block isn't restricted to an expression.
It's a full block, with separate statements.
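
The behavior of continue is easy to observe in a tiny loop. This is a minimal sketch; the helper name trace_continue and the trace strings are made up for illustration.

```perl
#!/usr/bin/perl
# Sketch: when does a continue block run?
use strict;
use warnings;

# Hypothetical helper: record the order in which the loop body and
# the continue block execute for the numbers 1 through 4.
sub trace_continue {
    my @trace;
    for my $n (1 .. 4) {
        next if $n % 2;                 # odd: skip the rest of the body
        push @trace, "body $n";
    } continue {
        push @trace, "continue $n";     # runs after fall-through and after next
    }
    return @trace;
}

print "$_\n" for trace_continue();
# prints, one per line:
#   continue 1
#   body 2
#   continue 2
#   continue 3
#   body 4
#   continue 4
```

The odd iterations leave the body early through next, yet their continue lines still appear, which is why Example 10-4 can append each paragraph to the current message in the continue block.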

10.18.1. See Also


The sort function in Chapter 29 of
Programming Perl and in
perlfunc(1); the discussion of the
$/ ($RS,
$INPUT_RECORD_SEPARATOR) variable in Chapter 28 of
Programming Perl, in
perlvar(1), and in the Introduction to
Chapter 8; Recipe 3.7; Recipe 4.16; Recipe 5.10;
Recipe 11.9







Copyright © 2003 O'Reilly & Associates. All rights reserved.
