Perl Cd Bookshelf [Electronic resources] (text version)




4.7. Extracting Unique Elements from a List


4.7.1. Problem




You want to eliminate
duplicate values from a list, such as when you build the list from a
file or from the output of another command. This recipe is equally
applicable to removing duplicates as they occur in input and to
removing duplicates from an array you've already populated.

4.7.2. Solution


Use a hash to record which items have been seen, then
keys to extract them. You can use Perl's idea of
truth to shorten and speed up your code.

4.7.2.1. Straightforward


%seen = ( );
@uniq = ( );
foreach $item (@list) {
    unless ($seen{$item}) {
        # if we get here, we have not seen it before
        $seen{$item} = 1;
        push(@uniq, $item);
    }
}

4.7.2.2. Faster


%seen = ( );
@uniq = ( );
foreach $item (@list) {
    # post-increment is false only the first time $item is seen
    push(@uniq, $item) unless $seen{$item}++;
}

4.7.2.3. Similar but with user function


%seen = ( );
foreach $item (@list) {
    some_func($item) unless $seen{$item}++;
}

4.7.2.4. Faster but different


%seen = ( );
foreach $item (@list) {
    $seen{$item}++;
}
@uniq = keys %seen;

4.7.2.5. Faster and even more different


%seen = ( );
@uniq = grep { ! $seen{$_} ++ } @list;

4.7.3. Discussion


The question at the heart of the matter is "Have I seen this element
before?" Hashes are ideally suited to such lookups. The first
technique (Recipe 4.7.2.1) builds up the array of unique values as we
go along, using a hash to record whether something is already in the
array.

The second technique (Recipe 4.7.2.2) is the most
natural way to write this sort of thing in Perl. It creates a new
entry in the hash every time it sees an element that hasn't been seen
before, using the ++ operator. This has the side
effect of making the hash record the number of times the element was
seen. This time we only use the hash for its property of working like
a set.
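
As a small illustration of that side effect, the sketch below runs the second technique over some made-up sample data (the fruit names are not from the recipe) and then reads the counts back out of the hash:

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original recipe.
my @list = qw(apple pear apple fig pear apple);

my (%seen, @uniq);
foreach my $item (@list) {
    # $seen{$item}++ returns the previous count (0, which is false, the
    # first time) and then increments it, so the push happens only once.
    push(@uniq, $item) unless $seen{$item}++;
}

print "unique: @uniq\n";
print "$_ was seen $seen{$_} time(s)\n" for @uniq;
```

If the counts are not needed, the hash is still doing its job as a set; the numbers are simply ignored.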

The third example (Recipe 4.7.2.3) is similar to the
second but rather than storing the item away, we call some
user-defined function with that item as its argument. If that's all
we're doing, keeping a spare array of those unique values is
unnecessary.
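
A minimal sketch of this pattern, with a hypothetical greet function standing in for some_func and invented names as input:

```perl
use strict;
use warnings;

my @greeted;    # kept only so the calls can be inspected afterward

# greet is a hypothetical stand-in for some_func.
sub greet {
    my ($name) = @_;
    push @greeted, $name;
    print "hello, $name\n";
}

my %seen;
foreach my $item (qw(ann bob ann cy bob)) {
    # call the function only on the first sighting of each item
    greet($item) unless $seen{$item}++;
}
```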

The next mechanism (Recipe 4.7.2.4) waits until it's
done processing the list to extract the unique keys from the
%seen hash. This may be convenient, but the
original order has been lost.
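
The sketch below, using made-up data, shows one common way to cope with the lost order: if any stable order will do, sort the keys. The original order cannot be recovered from the hash alone.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original recipe.
my @list = qw(delta alpha charlie alpha bravo delta);

my %seen;
$seen{$_}++ for @list;

my @uniq   = keys %seen;         # order is unspecified and may differ between runs
my @sorted = sort keys %seen;    # sorting gives a reproducible order

print "as returned by keys: @uniq\n";
print "sorted:              @sorted\n";
```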

The final approach (Recipe 4.7.2.5) merges the
construction of the %seen hash with the extraction
of unique elements. This preserves the original order of elements.
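
One way to package this approach is a small subroutine with its own lexical hash, so that no %seen is left over between calls. The name unique_in_order is an assumption made for this sketch, not something the recipe defines:

```perl
use strict;
use warnings;

# unique_in_order is a hypothetical helper name.
sub unique_in_order {
    my %seen;                           # fresh hash on every call
    return grep { !$seen{$_}++ } @_;    # keep only the first occurrence of each value
}

my @list = qw(3 1 3 2 1 3);
my @uniq = unique_in_order(@list);
print "@uniq\n";
```

Because grep walks the list from left to right, the surviving elements appear in the order in which they were first seen.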

Using a hash to record the values has two side effects: processing
long lists can take a lot of memory, and the list returned by
keys is unordered.

Here's an example of processing input as it is read. We use
`who` to gather information on the current user
list, then extract the username from each line before updating the
hash:

# generate a list of users logged in, removing duplicates
%ucnt = ( );
for (`who`) {
    s/\s.*\n//;    # kill from first space till end-of-line, yielding username
    $ucnt{$_}++;   # record the presence of this user
}
# extract and print unique keys
@users = sort keys %ucnt;
print "users logged in: @users\n";
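
A variation on the same idea, sketched under a few assumptions: the input is read line by line from a filehandle rather than from backticks, and users are kept in the order of their first appearance instead of being sorted. Canned text with invented usernames stands in for the output of who so the example runs anywhere:

```perl
use strict;
use warnings;

# Canned text standing in for the output of `who`; the names are made up.
my $who_output = <<'END';
alice    tty1     2003-01-01 09:00
bob      pts/0    2003-01-01 09:05
alice    pts/1    2003-01-01 09:10
END

# Read the string through an in-memory filehandle, one line at a time.
open(my $fh, '<', \$who_output) or die "can't open in-memory file: $!";

my (%ucnt, @users);
while (my $line = <$fh>) {
    my ($user) = split ' ', $line;                # first whitespace-separated field
    next unless defined $user;                    # skip blank lines
    push(@users, $user) unless $ucnt{$user}++;    # keep first-appearance order
}
close($fh);

print "users logged in: @users\n";
```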

4.7.4. See Also


The "Foreach Loops" section of perlsyn(1) and Chapter 4 of
Programming Perl; the keys function in perlfunc(1) and Chapter 29 of
Programming Perl; the "Hashes" section of Chapter 2 of Programming
Perl; Chapter 5; we use hashes in a similar fashion in Recipe 4.8 and
Recipe 4.9







Copyright © 2003 O'Reilly & Associates. All rights reserved.
