Perl CD Bookshelf [Electronic resources], text version

1.6. Processing a String One Character at a Time


1.6.1. Problem


You want to process a
string one character at a time.

1.6.2. Solution


Use
split with a null pattern to break up the string
into individual characters, or use unpack if you
just want the characters' values:

@array = split(//, $string); # each element a single character
@array = unpack("U*", $string); # each element a code point (number)

Or extract each character in turn with a loop:

while (/(.)/g) {    # . is never a newline here
    # $1 has character, ord($1) its number
}
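
To make the difference between the three forms concrete, here is a minimal sketch that runs each of them on the same short string. The sample string and the variable names are assumptions chosen for illustration, and the loop binds `$string` explicitly instead of matching against `$_`.

```perl
use strict;
use warnings;

my $string = "abc";    # hypothetical sample input

my @chars = split(//, $string);       # one element per character
my @codes = unpack("U*", $string);    # one element per code point

# Loop form: collect each character together with its number.
my @pairs;
while ($string =~ /(.)/g) {
    push @pairs, $1 . "=" . ord($1);
}

print "@chars\n";    # a b c
print "@codes\n";    # 97 98 99
print "@pairs\n";    # a=97 b=98 c=99
```

The loop never builds a list of all characters up front, which is the reason to prefer it when you don't need the array.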

1.6.3. Discussion


As we said before, Perl's fundamental unit is the string, not the
character. Needing to process anything a character at a time is rare.
Usually some kind of higher-level Perl operation, like pattern
matching, solves the problem more handily. See, for example,
Recipe 7.14, where a set of substitutions is used to
find command-line arguments.

Splitting on a pattern that matches the empty string returns a list
of individual characters in the string. This is a convenient feature
when done intentionally, but it's easy to do unintentionally. For
instance, /X*/ matches every string, including the empty string, so
splitting on it breaks the string between every pair of characters
rather than only at runs of X. Odds are you will stumble on other such
patterns when you don't mean to.
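
The accidental case is easy to reproduce. The following sketch assumes a made-up input in which runs of X are meant to be separators, and compares a pattern that requires at least one X with one that also matches the empty string.

```perl
use strict;
use warnings;

my $string = "hiXXthere";    # hypothetical input; X is meant as a separator

# Intended: split only where there is at least one X.
my @intended = split(/X+/, $string);

# Accidental: X* also matches the empty string between any two characters,
# so every character ends up in its own field.
my @accidental = split(/X*/, $string);

print join(":", @intended), "\n";      # hi:there
print join(":", @accidental), "\n";    # h:i:t:h:e:r:e
```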

Here's an example that prints the characters used in the string
"an apple a day", sorted in ascending order:

%seen = ( );
$string = "an apple a day";
foreach $char (split //, $string) {
    $seen{$char}++;
}
print "unique chars are: ", sort(keys %seen), "\n";
unique chars are:  adelnpy

These split and unpack
solutions give an array of characters to work with. If you don't want
an array, use a pattern match with the /g flag in
a while loop, extracting one character at a time:

%seen = ( );
$string = "an apple a day";
while ($string =~ /(.)/g) {
    $seen{$1}++;
}
print "unique chars are: ", sort(keys %seen), "\n";
unique chars are:  adelnpy

In general, whenever you find yourself doing character-by-character
processing, there's probably a better way to go about it. Instead of
using index and substr or split and unpack, it might be
easier to use a pattern. And rather than computing a 32-bit checksum by
hand, as in the next example, you can have the unpack function
compute it far more efficiently.

The following example calculates the checksum of
$string with a foreach loop.
There are better checksums; this just happens to be the basis of a
traditional and computationally easy checksum. You can use the
standard[3] Digest::MD5
module if you want a more robust checksum.

[3]It's standard as of the v5.8 release of
Perl; otherwise, grab it from CPAN.

$sum = 0;
foreach $byteval (unpack("C*", $string)) {
    $sum += $byteval;
}
print "sum is $sum\n";
# prints "sum is 1248" if $string was "an apple a day"

This does the same thing, but much faster:

$sum = unpack("%32C*", $string);
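
As a quick check that the two approaches agree, this sketch computes the sum both ways for the sample string used above. The variable names are assumptions, and the last line is added only to illustrate that the number after % sets the checksum's width in bits.

```perl
use strict;
use warnings;

my $string = "an apple a day";

# By hand: add up the byte values one at a time.
my $by_hand = 0;
$by_hand += $_ for unpack("C*", $string);

# Let unpack do the summing, kept to 32 bits.
my $by_unpack = unpack("%32C*", $string);

# An 8-bit checksum of the same data wraps around at 256.
my $eight_bit = unpack("%8C*", $string);

print "by hand: $by_hand, by unpack: $by_unpack, 8-bit: $eight_bit\n";
# by hand: 1248, by unpack: 1248, 8-bit: 224
```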

This emulates the SysV checksum program:

#!/usr/bin/perl
# sum - compute 16-bit checksum of all input files
$checksum = 0;
while (<>) { $checksum += unpack("%16C*", $_) }
$checksum %= (2 ** 16) - 1;
print "$checksum\n";

Here's an example of its use:

% perl sum /etc/termcap
1510

If you have the GNU version of sum, you'll need
to call it with the --sysv
option to get the same answer on the same file.

% sum --sysv /etc/termcap
1510 851 /etc/termcap

Another tiny program that processes its input one character at a time
is slowcat, shown in
Example 1-1. The idea here is to pause after each
character is printed so you can scroll text before an audience slowly
enough that they can read it.

Example 1-1. slowcat


#!/usr/bin/perl
# slowcat - emulate a   s l o w   line printer
# usage: slowcat [-DELAY] [files ...]
$DELAY = ($ARGV[0] =~ /^-([.\d]+)/) ? (shift, $1) : 1;
$| = 1;
while (<>) {
    for (split(//)) {
        print;
        select(undef,undef,undef, 0.005 * $DELAY);
    }
}
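
The pause in slowcat comes from the four-argument form of select, which waits for the given timeout when it has no filehandles to watch. Here is a minimal sketch of that idiom on its own. The helper name nap is hypothetical, and Time::HiRes is used only to measure the delay. The actual pause may run slightly longer than requested, and the idiom may behave differently on some platforms.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# nap() is a hypothetical helper name, not part of slowcat.
sub nap {
    my ($seconds) = @_;
    # No filehandles to watch, so select just waits for the timeout.
    select(undef, undef, undef, $seconds);
}

my $start = time();
nap(0.25);
my $elapsed = time() - $start;
printf "paused for about %.2f seconds\n", $elapsed;
```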

1.6.4. See Also


The split and unpack functions
in perlfunc(1) and Chapter 29 of
Programming Perl; the use of four-argument
select for timing is explained in
Recipe 3.10.







Copyright © 2003 O'Reilly & Associates. All rights reserved.
